Code for "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" paper

Overview

Myo Keylogging

This is the source code for our paper "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" by Matthias Gazzari, Annemarie Mattmann, Max Maass and Matthias Hollick, published in Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 5, Issue 4, 2021.

We include the software used for recording the dataset (record folder) and the software for training and running the neural networks (ml folder) as well as analyzing the results (analysis folder). The scripts folder provides some helper scripts for automating batches of hyperparameter optimization, model fitting, analyses and more. The results folder includes a pickled version of the predictions of our models, on which analyses can be run, e.g. to reproduce the paper results.

Installation

To install the project, first clone the repository and change directory into the fresh clone:

git clone https://github.com/seemoo-lab/myo-keylogging.git
cd myo-keylogging

You can use a Python virtual environment (the commands below use virtualenvwrapper; any other virtual environment of your choice works as well):

mkvirtualenv myo --system-site-packages
workon myo

To make sure pip and setuptools are up to date, you can upgrade them:

pip install --upgrade pip setuptools

To install the requirements run:

pip install -r requirements.txt

Finally, import the training and test data into the project. The top level folder should include a folder train-data with all the records for training the models and a folder test-data with all the records for testing the models.

wget https://zenodo.org/record/5594651/files/myo-keylogging-dataset.zip
unzip myo-keylogging-dataset.zip
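Before fitting or analyzing anything, it can help to confirm that the archive was unpacked where the project expects it. The following is a minimal sketch, not part of the repository: the helper name `missing_data_dirs` is hypothetical, and it only checks for the two folder names mentioned above.

```python
from pathlib import Path


def missing_data_dirs(root="."):
    """Return the expected data folders that are absent under `root`.

    The folder names come from the installation text above; the function
    itself is a hypothetical helper, not part of the project.
    """
    expected = ("train-data", "test-data")
    return [name for name in expected if not (Path(root) / name).is_dir()]


missing = missing_data_dirs()
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("Found train-data and test-data.")
```

Run it from the top level directory of the clone; an empty result means both folders are in place.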

Using the record library, you can extend this dataset with your own recordings.

Rerun of Results

To reproduce our results from the provided predictions of our models, go to the top level directory and run:

./scripts/create_results.sh

This will recreate all performance value files and plots in the subfolders of the results folder as used in the paper.

To list the fastest and slowest typists, so that you can look up their class imbalance in the results/train-data-skew.csv and results/test-data-skew.csv files, run:

python -m analysis exp_key_data
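To look at the class skew files themselves, a quick preview is often enough. The sketch below uses only the standard library and makes one assumption that the text above does not confirm: that the first row of each CSV file is a header. The helper name `preview_csv` is hypothetical.

```python
import csv


def preview_csv(path, n_rows=5):
    """Return the first row and up to `n_rows` following rows of a CSV file.

    Assumption: the first row is a header. If the skew files have no
    header, the first data row simply ends up in `header`.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops after n_rows items without reading the rest of the file.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows
```

For example, `preview_csv("results/train-data-skew.csv")` returns the column names and the first five records as lists of strings.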

To recreate the provided predictions and class skew files, execute the following from the top level directory:

./scripts/create_models.sh
./scripts/create_predictions.sh
./scripts/create_class_skew_files.sh

This will fit the models with the current choice of hyperparameters and run each model on the test dataset to create the required predictions for analysis. Additionally, the class skew files will be recreated.
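Since each step depends on the output of the previous one, it may be convenient to run the three scripts as a single pipeline that stops at the first failure. This is a sketch under that assumption; `run_pipeline` is a hypothetical wrapper, and only the script paths are taken from the commands above.

```python
import subprocess

# Script paths as listed above, in the order they are meant to run.
PIPELINE = [
    "./scripts/create_models.sh",
    "./scripts/create_predictions.sh",
    "./scripts/create_class_skew_files.sh",
]


def run_pipeline(scripts=PIPELINE, runner=subprocess.run):
    """Run each script in order and stop at the first failure."""
    for script in scripts:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on missing or stale outputs.
        runner([script], check=True)
```

Calling `run_pipeline()` from the top level directory executes all three steps; the `runner` parameter only exists so the order can be checked without actually fitting models.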

To run the hyperparameter optimization, either submit the slurm_run_shallow_hpo.sh script when on a SLURM cluster or run the run_shallow_hpo.sh script directly:

sbatch scripts/slurm_run_shallow_hpo.sh
./scripts/run_shallow_hpo.sh

Afterwards you can use the merge_shallow_hpo_runs.py script to combine the results for easier evaluation of the hyperparameters.

Fit Models

In order to fit and analyze your own models, go to the top level directory and run any of:

python -m ml crnn
python -m ml resnet
python -m ml resnet11
python -m ml wavenet

This will fit the respective model with the default parameters and in binary mode for keystroke detection. In order to fit multiclass models for keystroke identification, use the encoding parameter, e.g.:

python -m ml crnn --encoding "multiclass"
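To fit every architecture in both modes, the command lines can be generated rather than typed by hand. The sketch below only builds the argument lists; it relies on the default (no `--encoding` flag) for binary mode, as described above, and the helper name `fit_commands` is hypothetical.

```python
import itertools
import sys

# Model names as listed above.
MODELS = ["crnn", "resnet", "resnet11", "wavenet"]


def fit_commands(models=MODELS, encodings=(None, "multiclass")):
    """Build one `python -m ml ...` command per model and encoding.

    An encoding of None leaves the flag out, which selects the default
    binary mode according to the text above.
    """
    commands = []
    for model, encoding in itertools.product(models, encodings):
        cmd = [sys.executable, "-m", "ml", model]
        if encoding is not None:
            cmd += ["--encoding", encoding]
        commands.append(cmd)
    return commands


for cmd in fit_commands():
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the top level directory to start the actual fit.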

To test specific sensors, ignore all others (note that quaternions are ignored by default). For example, to use only EMG data with a CRNN model, run:

python -m ml crnn --ignore "quat" "acc" "gyro"
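For sensor ablation studies it is easier to state which sensors to keep and derive the `--ignore` list from that. The following sketch does this inversion. Note the assumption: the identifiers "quat", "acc" and "gyro" appear in the command above, while "emg" is an assumed name for the remaining sensor group and is never passed to the command line in the EMG-only case.

```python
# "quat", "acc" and "gyro" are taken from the command above;
# "emg" is an assumed identifier for the EMG channels.
SENSORS = ["emg", "acc", "gyro", "quat"]


def ignore_args(keep):
    """Return the --ignore arguments that leave only `keep` active."""
    unknown = set(keep) - set(SENSORS)
    if unknown:
        raise ValueError(f"unknown sensors: {sorted(unknown)}")
    ignored = [sensor for sensor in SENSORS if sensor not in keep]
    return ["--ignore", *ignored] if ignored else []


# EMG only: yields the same set of ignored sensors as the example above.
print(ignore_args(["emg"]))
```

The returned list can be appended to a `python -m ml <model>` command; check `python -m ml crnn --help` for the sensor names the parser actually accepts.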

To run a hyperparameter optimization, run e.g.:

python -m ml crnn --func shallow_hpo --step 5

To gain more information on possible parameters, run e.g.:

python -m ml crnn --help

Some parameters for the neural networks are fixed in the code.

Analyze Models

In order to analyze your models, run apply_models to create the predictions as pickled files. On these you can run further analyses found in the analysis folder.

To run apply_models on a binary model, do:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding binary --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS>

To run a multiclass model, do:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding multiclass --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS>

To chain a binary and multiclass model, do e.g.:

python -m analysis apply_models --model_path results/<PATH_TO_MODEL> --encoding chain --data_path test-data/ --save_path results/<PATH_TO_PKL> --save_only --basenames <YOUR MODELS> --tolerance 10

Further parameters of interest for analyses are filters on the users (--users known or --users unknown) and on the data (--data known or --data unknown). The user filter includes only users who are (or are not) in the training data; the data filter includes only data typed by all other users or by no other user, respectively.
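The three apply_models invocations above differ only in a few arguments, so they can be assembled by one small function. This is a sketch using only flags that appear in this section; the function name `apply_models_cmd` is hypothetical, and passing several values after `--basenames` is an assumption based on the `<YOUR MODELS>` placeholder.

```python
import sys


def apply_models_cmd(model_path, save_path, basenames, encoding="binary",
                     data_path="test-data/", users=None, data=None,
                     tolerance=None):
    """Build the argument list for `python -m analysis apply_models`."""
    if encoding not in ("binary", "multiclass", "chain"):
        raise ValueError(f"unexpected encoding: {encoding}")
    cmd = [
        sys.executable, "-m", "analysis", "apply_models",
        "--model_path", model_path,
        "--encoding", encoding,
        "--data_path", data_path,
        "--save_path", save_path,
        "--save_only",
        "--basenames", *basenames,  # assumed to accept several names
    ]
    # Optional filters and the tolerance used in the chain example.
    if users is not None:
        cmd += ["--users", users]
    if data is not None:
        cmd += ["--data", data]
    if tolerance is not None:
        cmd += ["--tolerance", str(tolerance)]
    return cmd
```

For instance, `apply_models_cmd("results/my_models", "results/my_pkl", ["my_crnn"], encoding="chain", users="unknown", tolerance=10)` yields a chained run restricted to users outside the training data; the paths and model name in this call are placeholders.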

For more information, run:

python -m analysis apply_models --help

To later recreate model performance results and plots, run:

python -m analysis apply_models --encoding <ENCODING> --load_results results/<PATH_TO_PKL> --save_path results/<PATH_TO_PKL> --save_only

with the appropriate encoding of the model used to create the pickled results.
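The pickled prediction files can also be opened directly in Python, for example to see what they contain before writing a new analysis. The sketch below makes no assumption about the internal layout of the files; it only reports the top-level structure. The helper names are hypothetical. Unpickling executes code stored in the file, so only load files you trust, and run it from the top level directory in case the pickles reference project modules.

```python
import pickle


def load_pickle(path):
    """Load a pickled object from `path` (trusted files only)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Summarize the top-level structure without assuming a layout."""
    if isinstance(obj, dict):
        return {key: type(value).__name__ for key, value in obj.items()}
    return type(obj).__name__
```

Calling `describe(load_pickle(path))` on one of the files under the results folder shows whether it holds a dictionary, a list or another object, which is a reasonable starting point for custom analyses.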

To run further analyses on the generated predictions, create or choose your analysis from the analysis folder and run:

python -m analysis <ANALYSIS_NAME>

Refer to the help for further information:

python -m analysis <ANALYSIS_NAME> --help

Record Data

In order to record your own data(set), switch to the record folder. To record sensor data with our recording software, you will need one or two Myo armbands connected to your computer. Then, you can start a training data recording, e.g.:

python tasks.py -s 42 -l german record touch_typing --left_tty <TTY_LEFT_MYO> --left_mac <MAC_LEFT_MYO> --right_tty <TTY_RIGHT_MYO> --right_mac <MAC_RIGHT_MYO> --kb_model TADA68_DE

for a German recording with seed 42, a touch typist and a TADA68 German physical keyboard layout or

python tasks.py -s 42 -l english record touch_typing --left_tty <TTY_LEFT_MYO> --left_mac <MAC_LEFT_MYO> --right_tty <TTY_RIGHT_MYO> --right_mac <MAC_RIGHT_MYO> --kb_model TADA68_US

for an English recording with seed 42, a hybrid typist and a TADA68 English physical keyboard layout.

In order to start a test data recording, run passwords.py instead of tasks.py.

After recording training data, please execute the following script to complete the metadata:

python update_text_meta.py -p ../train-data/

After recording test data, please execute the following two scripts to complete the metadata:

python update_pw_meta.py -p ../test-data/
python update_cuts.py -p ../test-data/

For further information, check:

python tasks.py --help
python passwords.py --help

Note that the recording software includes text extracts as outlined in the acknowledgments below.

Acknowledgments

This work includes the following external materials to be found in the record folder:

  1. Various texts from Wikipedia available under the CC-BY-SA 3.0 license.
  2. The EFF's New Wordlists for Random Passphrases available under the CC-BY 3.0 license.
  3. An extract of the Top 1000 most common passwords by Daniel Miessler, Jason Haddix, and g0tmi1k available under the MIT license.

License

This software is licensed under the GPLv3 license. Please also refer to the LICENSE file.

Owner
Secure Mobile Networking Lab