This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

Overview

Merantix-Labs: DAAIN

This is the code for our paper DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows which can be found at arxiv.

Assumptions

There are assumptions:

  • The training data PerturbedDataset makes some assumptions about the data:
    • the ignore_index is 255
    • num_classes = 19
    • the images are resized with size == 512

Module Overview

A selection of the files with some pointers what to find where

├── configs                                   # The yaml configs
│   ├── activation_spaces
│   │   └── esp_net_256_512.yaml
│   ├── backbone
│   │   ├── esp_dropout.yaml
│   │   └── esp_net.yaml
│   ├── dataset_paths
│   │   ├── bdd100k.yaml
│   │   └── cityscapes.yaml
│   ├── data_creation.yaml                    # Used to create the training and testing data in one go
│   ├── detection_inference.yaml              # Used for inference
│   ├── detection_training.yaml               # Used for training
│   ├── esp_dropout_training.yaml             # Used to train the MC dropout baseline
│   └── paths.yaml
├── README.md                                 # This file!
├── requirements.in                           # The requirements
├── setup.py
└── src
   └── daain
       ├── backbones                          # Definitions of the backbones, currently only a slighlty modified version
       │   │                                  # of the ESPNet was tested
       │   ├── esp_dropout_net
       │   │   ├── esp_dropout_net.py
       │   │   ├── __init__.py
       │   │   ├── lightning_module.py
       │   │   └── trainer
       │   │       ├── criteria.py
       │   │       ├── data.py
       │   │       ├── dataset_collate.py
       │   │       ├── data_statistics.py
       │   │       ├── __init__.py
       │   │       ├── iou_eval.py
       │   │       ├── README.md
       │   │       ├── trainer.py            # launch this file to train the ESPDropoutNet
       │   │       ├── transformations.py
       │   │       └── visualize_graph.py
       │   └── esp_net
       │       ├── espnet.py                 # Definition of the CustomESPNet
       │       └── layers.py
       ├── baseline
       │   ├── maximum_softmax_probability.py
       │   ├── max_logit.py
       │   └── monte_carlo_dropout.py
       ├── config_schema
       ├── constants.py                      # Some constants, the last thing to refactor...
       ├── data                              # General data classes
       │   ├── datasets
       │   │   ├── bdd100k_dataset.py
       │   │   ├── cityscapes_dataset.py
       │   │   ├── labels
       │   │   │   ├── bdd100k.py
       │   │   │   ├── cityscape.py
       │   │   └── semantic_segmentation_dataset.py
       │   ├── activations_dataset.py        # This class loads the recorded activations
       │   └── perturbed_dataset.py          # This class loads the attacked images
       ├── model
       │   ├── aggregation_mode.py           # Not interesting for inference
       │   ├── classifiers.py                # All classifiers used are defined here
       │   ├── model.py                      # Probably the most important module. Check this for an example on how
       │   │                                 # to used the detection model and how to load the parts
       │   │                                 # (normalising_flow & classifier)
       │   └── normalising_flow
       │       ├── coupling_blocks
       │       │   ├── attention_blocks
       │       │   ├── causal_coupling_bock.py  # WIP
       │       │   └── subnet_constructors.py
       │       └── lightning_module.py
       ├── scripts
       │   └── data_creation.py              # Use this file to create the training and testing data
       ├── trainer                           # Trainer of the full detection model
       │   ├── data.py                       # Loading the data...
       │   └── trainer.py
       ├── utils                             # General utils
       └── visualisations                    # Visualisation helpers

Parts

In general the model consists of two parts:

  • Normalising FLow
  • Classifier / Scoring method

Both have to be trained separately, depending on the classifier. Some are parameter free (except for the threshold).

The general idea can be summarised:

  1. Record the activations of the backbone model at specific locations during a forward pass.
  2. Transform the recorded activations using a normalising flow and map them to a standard Gaussian for each variable.
  3. Apply some simple (mostly distance based) classifier on the transformed activations to get the anomaly score.

Training & Inference Process

  1. Generate perturbed and adversarial images. We do not provide code for this step.
  2. Generate the activations using src/daain/scripts/data_creation.py
  3. Train the detection model using src/daain/trainer/trainer.py
  4. Use src/daain/model/model.py to load the trained model and use it to get the anomaly score (the probability that the input was anomalous).
Owner
Merantix
Merantix
Official implementation of "An Image is Worth 16x16 Words, What is a Video Worth?" (2021 paper)

An Image is Worth 16x16 Words, What is a Video Worth? paper Official PyTorch Implementation Gilad Sharir, Asaf Noy, Lihi Zelnik-Manor DAMO Academy, Al

213 Nov 12, 2022
A real-time dolly zoom camera effect

Dolly-Zoom I've always been amazed by the gradual perspective change of dolly zoom, and I have some experience in python and OpenCV, so I decided to c

Dylan Kai Lau 52 Dec 08, 2022
This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.

SCUT-CTW1500 Datasets We have updated annotations for both train and test set. Train: 1000 images [images][annos] Additional point annotation for each

Yuliang Liu 600 Dec 18, 2022
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

Ankush Gupta 1.8k Dec 28, 2022
SemTorch

SemTorch This repository contains different deep learning architectures definitions that can be applied to image segmentation. All the architectures a

David Lacalle Castillo 154 Dec 07, 2022
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l

Baoguang Shi 2k Dec 31, 2022
Demo processor to illustrate OCR-D Python API

ocrd_vandalize/ Demo processor to illustrate the OCR-D/core Python API Description :TODO: write docs :) Installation From PyPI pip3 install ocrd_vanda

Konstantin Baierer 5 May 05, 2022
Qrcode Attendence System with Opencv and Pyzbar

Setup process Creates a virtual environment (Scripts that ensure executed Python code uses the Python interpreter and site packages installed inside t

Ganesh 5 Aug 01, 2022
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics 21 Dec 08, 2021
Pre-Recognize Library - library with algorithms for improving OCR quality.

PRLib - Pre-Recognition Library. The main aim of the library - prepare image for recogntion. Image processing can really help to improve recognition q

Alex 80 Dec 30, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
This is the implementation of the paper "Gated Recurrent Convolution Neural Network for OCR"

Gated Recurrent Convolution Neural Network for OCR This project is an implementation of the GRCNN for OCR. For details, please refer to the paper: htt

90 Dec 22, 2022
M-LSDを用いて四角形を検出し、射影変換を行うサンプルプログラム

M-LSD-warpPerspective-Example M-LSDを用いて四角形を検出し、射影変換を行うサンプルプログラムです。 Requirements OpenCV 3.4.2 or Later tensorflow 2.4.1 or Later Usage 実行方法は以下です。 pytho

KazuhitoTakahashi 9 Oct 14, 2022
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

An Agnostic Object Detection Framework IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-q

airctic 790 Jan 05, 2023
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

MTLFace This repository contains the PyTorch implementation and the dataset of the paper: When Age-Invariant Face Recognition Meets Face Age Synthesis

Hzzone 120 Jan 05, 2023
Machine Leaning applied to denoise images to improve OCR Accuracy

Machine Learning to Denoise Images for Better OCR Accuracy This project is an adaptation of this tutorial and used only for learning purposes: https:/

Antonio Bri Pérez 2 Nov 16, 2022
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

This is the official implementation of "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation". For more details, please

Pengyuan Lyu 309 Dec 06, 2022
Framework for the Complete Gaze Tracking Pipeline

Framework for the Complete Gaze Tracking Pipeline The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1].

Pascal 20 Jan 06, 2023
An advanced 2D image manipulation with features such as edge detection and image segmentation built using OpenCV

OpenCV-ToothPaint3-Advanced-Digital-Image-Editor This application named ‘Tooth Paint’ version TP_2020.3 (64-bit) or version 3 was developed within a w

JunHong 1 Nov 05, 2021
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector. Only RBOX part is implemented. Using dice loss

365 Dec 20, 2022