
Obfuscation Revealed: Leveraging Electromagnetic Signals for Obfuscated Malware Classification

This repository contains documentation of code, datasets and models for the paper Obfuscation Revealed: Leveraging Electromagnetic Signals for Obfuscated Malware Classification published in ACSAC 2021.

Publications

Duy-Phuc Pham, Damien Marion, Matthieu Mastio, and Annelie Heuser. 2021. Obfuscation Revealed: Leveraging Electromagnetic Signals for Obfuscated Malware Classification. In Annual Computer Security Applications Conference (ACSAC). Association for Computing Machinery, New York, NY, USA, 706–719. DOI:https://doi.org/10.1145/3485832.3485894

Structure of the wiki

The wiki is structured as follows:

  1. The dataset of malware and benign executables (see Section 4 in the paper).
  2. Data acquisition code for reproducing the EM trace capture (see Sections 5.2-6.1 in the paper).
  3. Pre-trained models, containing all the pre-trained models for each scenario of the ML and DL algorithms (see Section 6.2 in the paper).
  4. Analysis tools to reproduce the results of the Machine Learning (ML) and Deep Learning (DL) models (see Section 7 in the paper).

Malware and benign dataset

Requirements

The dataset contains compiled ARM executables for both the malware and benign sets. Executables were compiled on Linux raspberrypi 4.19.57-v7+ ARM.

Usage

All executables can be run directly on the target device. The dataset is categorised into 5 families: bashlite, gonnacry, mirai, rootkit and goodware. The rootkits are the exception and need to be installed as follows:

Keysniffer rootkit

Rootkit installation:

sudo insmod kisni-4.19.57-v7+.ko

For rootkit uninstallation:

sudo rmmod kisni-4.19.57-v7+.ko

MaK_It rootkit

Run it only once per target device reboot:

ARG1=".maK_it"
ARG2="33"
rm -f /dev/$ARG1 #Making sure it's cleared
echo "Creating virtual device /dev/$ARG1"
mknod /dev/$ARG1 c $ARG2 0
chmod 777 /dev/$ARG1
echo "Keys will be logged to virtual device."

For rootkit installation:

sudo insmod maK_it4.19.57-v7+.ko

For rootkit uninstallation:

echo "debug" > /dev/.maK_it ; echo "modReveal" > /dev/.maK_it; #Un-hide rootkit
sudo rmmod maK_it4.19.57-v7+.ko; #Uninstall rootkit

For details of the commands used to execute the malware on the target device, please refer to the subfolder cmdFiles.

Note: This repository is made for research purposes. We are not liable or responsible for any damage caused by the installation of viruses or malware on your computer, software, equipment or other property resulting from your access to or use of this repository.

Data acquisition

The current repository contains all the scripts needed to interact with the data acquisition interfaces described in the paper "Obfuscation Revealed: Leveraging Electromagnetic Signals for Obfuscated Malware Classification".

Requirements

This repository supports the PicoScope® 6000 Series oscilloscopes. To install the required Python packages:

pip install -r requirements.txt

Data acquisition setup

Target device

We use a Raspberry Pi (model 1, 2 or 3) in our setup. It is connected to the host analysis machine over Ethernet and accessed via SSH. The SSH IP configuration can be modified in generate_traces_pico.py:

ssh.connect('192.168.1.177', username='pi')
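
For reference, a minimal sketch of opening this connection by hand, assuming generate_traces_pico.py relies on paramiko (whose SSHClient.connect() matches the call above); adapt the IP address and username to your own target device:

# Minimal sketch, assuming paramiko; key-based authentication is expected,
# as no password is passed.
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())  # accept the Pi's host key on first connect
ssh.connect('192.168.1.177', username='pi')                # same call as in generate_traces_pico.py
stdin, stdout, stderr = ssh.exec_command('uname -a')       # run a quick test command on the target
print(stdout.read().decode())
ssh.close()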

Oscilloscope, Amplifiers and Probe

We use a Langer PA-303 +30 dB amplifier, connected to an H-field probe (Langer RF-R 0.3-3), and a PicoScope 6407 with 1 GHz bandwidth. The probe, through the amplifier, is connected to port A, while the trigger from the target device is connected to port B of the PicoScope.

Wrapper configuration

To trigger the oscilloscope, we launch a wrapper program on the device. This wrapper simply sends the trigger and launches the program we want to monitor for the corresponding time. It is called automatically by generate_traces_pico.py; you only need to specify its path on the monitored device. The compiled wrapper can be stored in /home/pi/wrapper, or its path can be modified in generate_traces_pico.py. The wrapper already configures Raspberry Pi Plug P1 pin 11, which is GPIO pin 17, as the trigger input for the oscilloscope.
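
The wrapper shipped with the repository is a compiled binary; the following Python sketch only illustrates the trigger-then-execute logic described above (RPi.GPIO and BCM pin numbering are assumptions made for the illustration, not the actual implementation):

# Illustrative sketch only -- the real wrapper is a compiled binary.
# Assumes the RPi.GPIO library and BCM pin numbering.
import subprocess
import sys

import RPi.GPIO as GPIO

TRIGGER_PIN = 17  # GPIO 17 = physical pin 11 on the P1 header

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIGGER_PIN, GPIO.OUT, initial=GPIO.LOW)

GPIO.output(TRIGGER_PIN, GPIO.HIGH)   # raise the trigger monitored by the oscilloscope
subprocess.run(sys.argv[1:])          # launch the monitored command passed as arguments
GPIO.output(TRIGGER_PIN, GPIO.LOW)
GPIO.cleanup()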

Command file

You now need to provide the list of commands you want to monitor in a CSV-like file cmdFile.

The file must be of this form:

pretrigger-command,command,tag

Every loop iteration will, for each line of the cmdFile, do the following:

  1. Execute the pretrigger command on the device via SSH
  2. Arm the oscilloscope
  3. Trigger the oscilloscope and execute the monitored command
  4. Record the data in a file named tag-$randomId.dat

Example of a command file for launching keysniffer:

sudo rmmod kisni,./keyemu/emu.sh A 10,keyemu
sudo insmod keysniffer/kisni-4.19.57-v7+.ko,./keyemu/emu.sh A 10,keyemu_kisni
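
For reference, a short sketch of how such a cmdFile splits into its three fields (the path is the bashlite cmdFile used in the capture example below):

# Sketch: read a cmdFile and list its pre-trigger command, monitored command and tag.
import csv

with open('cmdFiles/cmdFile_bashlite.csv') as f:   # path taken from the example below
    for row in csv.reader(f):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        pretrigger_cmd, cmd, tag = row
        print(f'{tag}: pre-trigger "{pretrigger_cmd}", monitored "{cmd}"')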

Launch process traces capture

Example of traces capture:

./generate_traces_pico.py ./cmdFiles/cmdFile_bashlite.csv -c 3000 -d ./bashlite-2.43s-2Mss/ -t B --timebase 80 -n 5000000

This will capture 3000 traces from the oscilloscope, execute the Bashlite malware on the target device with the path defined in cmdFile_bashlite.csv, and write the traces to the folder ./bashlite-2.43s-2Mss on the host analysis machine. The oscilloscope runs in block mode with the timebase set to 80. For more details please refer to the data-acquisition repository.
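
To sanity-check a capture from the host side, here is a sketch that assumes the .dat files contain raw binary samples readable with numpy.fromfile (the dtype is an assumption; verify how generate_traces_pico.py writes the traces before relying on the values):

# Sanity-check sketch: count the captured traces and peek at the first one.
import glob
import numpy as np

traces = sorted(glob.glob('./bashlite-2.43s-2Mss/*.dat'))
print(f'{len(traces)} traces captured')
samples = np.fromfile(traces[0], dtype=np.int16)  # dtype assumed, check the acquisition script
print('samples in first trace:', samples.size)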

Pre-trained models

This repository contains all the pre-trained models for each scenario and each Deep Learning (DL) and Machine Learning (ML) algorithm. The Deep Learning models are compressed in 7z format and need to be decompressed before they can be used with the other modules; use run_decompression.sh to decompress the files.
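
A minimal loading sketch, assuming the decompressed DL models are Keras h5 files like the ones produced by the training script in the analysis section (the path and file name below are hypothetical; pick the scenario you need):

# Minimal sketch, assuming Keras h5 models; path and file name are hypothetical.
from tensorflow import keras

model = keras.models.load_model('pretrained_models/CNN/type.h5')
model.summary()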

Analysis Tools

Validation of test dataset

Requirements

To be able to run the analysis you may need Python 3.6 and the required packages:

pip install -r requirements.txt

Test dataset

Two datasets are available to reproduce the results, hosted on the following website:

https://zenodo.org/record/5414107


The two datasets are:

  • traces_selected_bandwidth.zip: the extracted bandwidths (40) of the spectrograms from the testing dataset, used to reproduce the classification results presented in the paper,
  • raw_data_reduced_dataset.zip: a reduced set of the raw electromagnetic traces, used to reproduce the end-to-end process (pre-processing and classification).
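
Both archives can be fetched directly from the Zenodo record; a short download sketch, where the direct file URL follows Zenodo's usual record/<id>/files/<name> pattern and is an assumption to verify on the record page:

# Sketch: download and unpack one of the test archives.
# The direct URL is assumed from Zenodo's usual layout; check the record page.
import urllib.request
import zipfile

url = 'https://zenodo.org/record/5414107/files/traces_selected_bandwidth.zip'
urllib.request.urlretrieve(url, 'traces_selected_bandwidth.zip')
with zipfile.ZipFile('traces_selected_bandwidth.zip') as zf:
    zf.extractall('traces_selected_bandwidth')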

Evaluation of test dataset

  1. Initialization

To update the locations of the data you previously downloaded inside the list files, you need to run the script update_lists.sh:

./update_lists  [directory where the lists are stored] [directory where the (downloaded) traces are stored]

This must be applied to the directories lists_selected_bandwidth and lists_reduced_dataset, respectively associated with the datasets traces_selected_bandwidth.zip and raw_data_reduced_dataset.zip.

For example:

./update_lists  ./lists_selected_bandwidth/ ./traces_selected_bandwidth

  2. Evaluation of Machine Learning (ML)

To run all the machine learning experiments, you can use the scripts run_ml_on_reduced_dataset.sh and run_ml_on_extracted_bandwidth.sh:

./run_ml_on_extracted_bandwidth.sh  [directory where the lists are stored] [directory where the models are stored] [directory where the accumulated data is stored (precomputed in pretrained_models/ACC) ]

The results are stored in the file ml_analysis/log-evaluation_selected_bandwidth.txt. Models and accumulators are available in the repository named pretrained_models.

For example:

./run_ml_on_extracted_bandwidth.sh lists_selected_bandwidth/ ../pretrained_models/ ../pretrained_models/ACC
./run_ml_on_reduced_dataset.sh  

The results are stored in the file ml_analysis/log-evaluation_reduced_dataset.txt.

  3. Evaluation of Deep Learning (DL)

To run the computation of all the deep learning experiments on the testing dataset with pre-trained models, you can use the script run_dl_on_selected_bandwidth.sh:

./run_dl_on_selected_bandwidth.sh  [directory where the lists are stored] [parent directory where the models are stored with subdirectories MLP/ and CNN/ (precomputed in pretrained_models/{CNN and MLP})] [directory where the accumulated data is stored (precomputed in pretrained_models/ACC) ]

The results are stored in the file evaluation_log_DL.txt.

For example:

./run_dl_on_selected_bandwidth.sh ../lists_selected_bandwidth/ ../pretrained_models/ ../pre-acc/

To train and store pre-trained models for the MLP and CNN architectures using the reduced dataset (downloaded from Zenodo), you can use the script run_dl_on_reduced_dataset.sh:

./run_dl_on_reduced_dataset.sh  [directory where the lists are stored] [directory where the accumulated data is stored (precomputed in pretrained_models/ACC) ] [DL architecture {cnn or mlp}] [number of epochs (e.g. 100)] [batch size (e.g. 100)]

The models are stored as h5 files in the same directory, named after the classification scenario. Validation accuracies over all scenarios and bandwidths are stored in training_log_reduced_dataset_{mlp,cnn}.txt.

Results with the "extracted bandwidth" dataset (traces_selected_bandwidth.zip)

Scenario            #   MLP AC []       CNN AC []       LDA + NB AC []   LDA + NB AC []
Type                4   99.75% [28]     99.82% [28]     97.97% [22]      98.07% [22]
Family              2   98.57% [28]     99.61% [28]     97.19% [28]      97.27% [28]
Virtualization      2   95.60% [20]     95.83% [24]     91.29% [6]       91.25% [6]
Packer              2   93.39% [28]     94.96% [20]     83.62% [16]      83.58% [16]
Obfuscation         7   73.79% [28]     82.70% [24]     64.29% [10]      64.47% [10]
Executable          35  73.56% [24]     82.28% [24]     70.92% [28]      71.84% [28]
Novelty (family)    5   88.41% [16]     98.85% [24]     98.25% [6]       98.61% [10]

Media coverage