ParmeSan: Sanitizer-guided Greybox Fuzzing

Related tags

Deep Learningparmesan
Overview

ParmeSan: Sanitizer-guided Greybox Fuzzing

License

ParmeSan is a sanitizer-guided greybox fuzzer based on Angora.

Published Work

USENIX Security 2020: ParmeSan: Sanitizer-guided Greybox Fuzzing.

The paper can be found here: ParmeSan: Sanitizer-guided Greybox Fuzzing

Building ParmeSan

See the instructions for Angora.

Basically run the following scripts to install the dependencies and build ParmeSan:

build/install_rust.sh
PREFIX=/path/to/install/llvm build/install_llvm.sh
build/install_tools.sh
build/build.sh

ParmeSan also builds a tool bin/llvm-diff-parmesan, which can be used for target acquisition.

Building a target

First build your program into a bitcode file using clang (e.g., base64.bc). Then build your target in the same way, but with your selected sanitizer enabled. To get a single bitcode file for larger projects, the easiest solution is to use gllvm.

# Build the bitcode files for target acquisition
USE_FAST=1 $(pwd)/bin/angora-clang -emit-llvm -o base64.fast.bc -c base64.bc
USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -emit-llvm -o base64.fast.asan.bc -c base64.bc
# Build the actual binaries to be fuzzed
USE_FAST=1 $(pwd)/bin/angora-clang -o base64.fast -c base64.bc
USE_TRACK=1 $(pwd)/bin/angora-clang -o base64.track -c base64.bc

Then acquire the targets using:

bin/llvm-diff-parmesan -json base64.fast.bc base64.fast.asan.bc

This will output a file targets.json, which you provide to ParmeSan with the -c flag.

For example:

$(pwd)/bin/fuzzer -c ./targets.json -i in -o out -t ./base64.track -- ./base64.fast -d @@

Options

ParmeSan's SanOpt option can speed up the fuzzing process by dynamically switching over to a sanitized binary only once the fuzzer reaches one of the targets specified in the targets.json file.

Enable using the -s [SANITIZED_BIN] option.

Build the sanitized binary in the following way:

USE_FAST=1 $(pwd)/bin/angora-clang -fsanitize=address -o base64.asan.fast -c base64.bc

Targets input file

The targets input file consisit of a JSON file with the following format:

{
  "targets":  [1,2,3,4],
  "edges":   [[1,2], [2,3]],
  "callsite_dominators": {"1": [3,4,5]}
}

Where the targets denote the identify of the cmp instruction to target (i.e., the id assigned by the __angora_trace_cmp() calls) and edges is the overlay graph of cmp ids (i.e., which cmps are connected to each other). The edges filed can be empty, since ParmeSan will add newly discovered edges automatically, but note that the performance will be better if you provide the static CFG.

It is also possible to run ParmeSan in pure directed mode (-D option), meaning that it will only consider new seeds if the seed triggers coverage that is on a direct path to one of the specified targets. Note that this requires a somewhat complete static CFG to work (an incomplete CFG might contain no paths to the targets at all, which would mean that no new coverage will be considered at all).

ParmeSan Screenshot

How to get started

Have a look at BUILD_TARGET.md for a step-by-step tutorial on how to get started fuzzing with ParmeSan.

FAQ

  • Q: I get a warning like ==1561377==WARNING: DataFlowSanitizer: call to uninstrumented function gettext when running the (track) instrumented program.
  • A: In many cases you can ignore this, but it will lose the taint (meaning worse performance). You need to add the function to the abilist (e.g., llvm_mode/dfsan_rt/dfsan/done_abilist.txt) and add a custom DFSan wrapper (in llvm_mode/dfsan_rt/dfsan/dfsan_custom.cc). See the Angora documentation for more info.
  • Q: I get an compiler error when building the track binary.
  • A: ParmeSan/ Angora uses DFSan for dynamic data-flow analysis. In certain cases building target applications can be a bit tricky (especially in the case of C++ targets). Make sure to disable as much inline assembly as possible and make sure that you link the correct libraries/ llvm libc++. Some programs also do weird stuff like an indirect call to a vararg function. This is not supported by DFSan at the moment, so the easy solution is to patch out these calls, or do something like indirect call promotion.
  • Q: llvm-diff-parmesan generates too many targets!
  • A: You can do target pruning using the scripts in tools/ (in particular tools/prune.py) or use ASAP to generate a target bitcode file with fewer sanitizer targets.

Docker image

You can also get the pre-built docker image of ParmeSan.

docker pull vusec/parmesan
docker run --rm -it vusec/parmesan
# In the container you can build objdump
/parmesan/misc/build_objdump.sh
Owner
VUSec
VUSec
Study of human inductive biases in CNNs and Transformers.

Are Convolutional Neural Networks or Transformers more like human vision? This repository contains the code and fine-tuned models of popular Convoluti

Shikhar Tuli 39 Dec 08, 2022
Few-shot Neural Architecture Search

One-shot Neural Architecture Search uses a single supernet to approximate the performance each architecture. However, this performance estimation is super inaccurate because of co-adaption among oper

Yiyang Zhao 38 Oct 18, 2022
Learning Features with Parameter-Free Layers (ICLR 2022)

Learning Features with Parameter-Free Layers (ICLR 2022) Dongyoon Han, YoungJoon Yoo, Beomyoung Kim, Byeongho Heo | Paper NAVER AI Lab, NAVER CLOVA Up

NAVER AI 65 Dec 07, 2022
PyTorch implementation of DeepUME: Learning the Universal Manifold Embedding for Robust Point Cloud Registration (BMVC 2021)

DeepUME: Learning the Universal Manifold Embedding for Robust Point Cloud Registration [video] [paper] [supplementary] [data] [thesis] Introduction De

Natalie Lang 10 Dec 14, 2022
The code uses SegFormer for Semantic Segmentation on Drone Dataset.

SegFormer_Segmentation The code uses SegFormer for Semantic Segmentation on Drone Dataset. The details for the SegFormer can be obtained from the foll

Dr. Sander Ali Khowaja 1 May 08, 2022
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
Anagram Generator in Python

Anagrams Generator This is a program for computing multiword anagrams. It makes no effort to come up with sentences that make sense; it only finds ana

Day Fundora 5 Nov 17, 2022
How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

AdamBNN This is the pytorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021. In this

Zechun Liu 47 Sep 20, 2022
The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Quantifying Hippocampus Volume for Alzheimer's Progression Background Alzheimer's disease (AD) is a progressive neurodegenerative disorder that result

Omar Laham 1 Jan 14, 2022
Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch

Next Word Prediction Keywords : Streamlit, BertTokenizer, BertForMaskedLM, Pytorch 🎬 Project Demo ✔ Application is hosted on Streamlit. You can see t

Vivek7 3 Aug 26, 2022
Rainbow: Combining Improvements in Deep Reinforcement Learning

Rainbow Rainbow: Combining Improvements in Deep Reinforcement Learning [1]. Results and pretrained models can be found in the releases. DQN [2] Double

Kai Arulkumaran 1.4k Dec 29, 2022
The official implementation for "FQ-ViT: Fully Quantized Vision Transformer without Retraining".

FQ-ViT [arXiv] This repo contains the official implementation of "FQ-ViT: Fully Quantized Vision Transformer without Retraining". Table of Contents In

132 Jan 08, 2023
efficient neural audio synthesis in the waveform domain

neural waveshaping synthesis real-time neural audio synthesis in the waveform domain paper • website • colab • audio by Ben Hayes, Charalampos Saitis,

Ben Hayes 169 Dec 23, 2022
Unofficial PyTorch Implementation of "Augmenting Convolutional networks with attention-based aggregation"

Pytorch Implementation of Augmenting Convolutional networks with attention-based aggregation This is the unofficial PyTorch Implementation of "Augment

DK 20 Sep 09, 2022
TART - A PyTorch implementation for Transition Matrix Representation of Trees with Transposed Convolutions

TART This project is a PyTorch implementation for Transition Matrix Representati

Lee Sael 2 Jan 19, 2022
Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph

NIRPS-ETC Exposure Time Calculator (ETC) and radial velocity precision estimator for the Near InfraRed Planet Searcher (NIRPS) spectrograph February 2

Nolan Grieves 2 Sep 15, 2022
G-NIA model from "Single Node Injection Attack against Graph Neural Networks" (CIKM 2021)

Single Node Injection Attack against Graph Neural Networks This repository is our Pytorch implementation of our paper: Single Node Injection Attack ag

Shuchang Tao 18 Nov 21, 2022
In this project we use both Resnet and Self-attention layer for cat, dog and flower classification.

cdf_att_classification classes = {0: 'cat', 1: 'dog', 2: 'flower'} In this project we use both Resnet and Self-attention layer for cdf-Classification.

3 Nov 23, 2022
Code for ACL 21: Generating Query Focused Summaries from Query-Free Resources

marge This repository releases the code for Generating Query Focused Summaries from Query-Free Resources. Please cite the following paper [bib] if you

Yumo Xu 28 Nov 10, 2022
One line to host them all. Bootstrap your image search case in minutes.

One line to host them all. Bootstrap your image search case in minutes. Survey NOW gives the world access to customized neural image search in just on

Jina AI 403 Dec 30, 2022