MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift

Overview

MemStream

Implementation of

MemStream detects anomalies from a multi-aspect data stream. We output an anomaly score for each record. MemStream is a memory augmented feature extractor, allows for quick retraining, gives a theoretical bound on the memory size for effective drift handling, is robust to memory poisoning, and outperforms 11 state-of-the-art streaming anomaly detection baselines.

After an initial training of the feature extractor on a small subset of normal data, MemStream processes records in two steps: (i) It outputs anomaly scores for each record by querying the memory for K-nearest neighbours to the record encoding and calculating a discounted distance and (ii) It updates the memory, in a FIFO manner, if the anomaly score is within an update threshold β.

Demo

  1. KDDCUP99: Run python3 memstream.py --dataset KDD --beta 1 --memlen 256
  2. NSL-KDD: Run python3 memstream.py --dataset NSL --beta 0.1 --memlen 2048
  3. UNSW-NB 15: Run python3 memstream.py --dataset UNSW --beta 0.1 --memlen 2048
  4. CICIDS-DoS: Run python3 memstream.py --dataset DOS --beta 0.1 --memlen 2048
  5. SYN: Run python3 memstream-syn.py --dataset SYN --beta 1 --memlen 16
  6. Ionosphere: Run python3 memstream.py --dataset ionosphere --beta 0.001 --memlen 4
  7. Cardiotocography: Run python3 memstream.py --dataset cardio --beta 1 --memlen 64
  8. Statlog Landsat Satellite: Run python3 memstream.py --dataset statlog --beta 0.01 --memlen 32
  9. Satimage-2: Run python3 memstream.py --dataset satimage-2 --beta 10 --memlen 256
  10. Mammography: Run python3 memstream.py --dataset mammography --beta 0.1 --memlen 128
  11. Pima Indians Diabetes: Run python3 memstream.py --dataset pima --beta 0.001 --memlen 64
  12. Covertype: Run python3 memstream.py --dataset cover --beta 0.0001 --memlen 2048

Command line options

  • --dataset: The dataset to be used for training. Choices 'NSL', 'KDD', 'UNSW', 'DOS'. (default 'NSL')
  • --beta: The threshold beta to be used. (default: 0.1)
  • --memlen: The size of the Memory Module (default: 2048)
  • --dev: Pytorch device to be used for training like "cpu", "cuda:0" etc. (default: 'cuda:0')
  • --lr: Learning rate (default: 0.01)
  • --epochs: Number of epochs (default: 5000)

Input file format

MemStream expects the input multi-aspect record stream to be stored in a contains , separated file.

Datasets

Processed Datasets can be downloaded from here. Please unzip and place the files in the data folder of the repository.

  1. KDDCUP99
  2. NSL-KDD
  3. UNSW-NB 15
  4. CICIDS-DoS
  5. Synthetic Dataset (Introduced in paper)
  6. Ionosphere
  7. Cardiotocography
  8. Statlog Landsat Satellite
  9. Satimage-2
  10. Mammography
  11. Pima Indians Diabetes
  12. Covertype

Environment

This code has been tested on Debian GNU/Linux 9 with a 12GB Nvidia GeForce RTX 2080 Ti GPU, CUDA Version 10.2 and PyTorch 1.5.

Owner
Stream-AD
Streaming Anomaly Detection
Stream-AD
Deep functional residue identification

DeepFRI Deep functional residue identification Citing @article {Gligorijevic2019, author = {Gligorijevic, Vladimir and Renfrew, P. Douglas and Koscio

Flatiron Institute 156 Dec 25, 2022
Official page of Struct-MDC (RA-L'22 with IROS'22 option); Depth completion from Visual-SLAM using point & line features

Struct-MDC (click the above buttons for redirection!) Official page of "Struct-MDC: Mesh-Refined Unsupervised Depth Completion Leveraging Structural R

Urban Robotics Lab. @ KAIST 37 Dec 22, 2022
XViT - Space-time Mixing Attention for Video Transformer

XViT - Space-time Mixing Attention for Video Transformer This is the official implementation of the XViT paper: @inproceedings{bulat2021space, title

Adrian Bulat 33 Dec 23, 2022
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Visual 3D Detection Package: This repo aims to provide flexible and reproducible visual 3D detection on KITTI dataset. We expect scripts starting from

Yuxuan Liu 305 Dec 19, 2022
Yolov5+SlowFast: Realtime Action Detection Based on PytorchVideo

Yolov5+SlowFast: Realtime Action Detection A realtime action detection frame work based on PytorchVideo. Here are some details about our modification:

WuFan 181 Dec 30, 2022
DeLiGAN - This project is an implementation of the Generative Adversarial Network

This project is an implementation of the Generative Adversarial Network proposed in our CVPR 2017 paper - DeLiGAN : Generative Adversarial Net

Video Analytics Lab -- IISc 110 Sep 13, 2022
🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.

Image Super-Resolution (ISR) The goal of this project is to upscale and improve the quality of low resolution images. This project contains Keras impl

idealo 4k Jan 08, 2023
Epidemiology analysis package

zEpid zEpid is an epidemiology analysis package, providing easy to use tools for epidemiologists coding in Python 3.5+. The purpose of this library is

Paul Zivich 111 Jan 08, 2023
AttGAN: Facial Attribute Editing by Only Changing What You Want (IEEE TIP 2019)

News 11 Jan 2020: We clean up the code to make it more readable! The old version is here: v1. AttGAN TIP Nov. 2019, arXiv Nov. 2017 TensorFlow impleme

Zhenliang He 568 Dec 14, 2022
Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)

Barbershop: GAN-based Image Compositing using Segmentation Masks Barbershop: GAN-based Image Compositing using Segmentation Masks Peihao Zhu, Rameen A

Peihao Zhu 928 Dec 30, 2022
A PyTorch implementation of Radio Transformer Networks from the paper "An Introduction to Deep Learning for the Physical Layer".

An Introduction to Deep Learning for the Physical Layer An usable PyTorch implementation of the noisy autoencoder infrastructure in the paper "An Intr

Gram.AI 120 Nov 21, 2022
Code for the ICME 2021 paper "Exploring Driving-Aware Salient Object Detection via Knowledge Transfer"

TSOD Code for the ICME 2021 paper "Exploring Driving-Aware Salient Object Detection via Knowledge Transfer" Usage For training, open train_test, run p

Jinming Su 2 Dec 23, 2021
ParaGen is a PyTorch deep learning framework for parallel sequence generation

ParaGen is a PyTorch deep learning framework for parallel sequence generation. Apart from sequence generation, ParaGen also enhances various NLP tasks, including sequence-level classification, extrac

Bytedance Inc. 169 Dec 22, 2022
Practical and Real-world applications of ML based on the homework of Hung-yi Lee Machine Learning Course 2021

Machine Learning Theory and Application Overview This repository is inspired by the Hung-yi Lee Machine Learning Course 2021. In that course, professo

SilenceJiang 35 Nov 22, 2022
This is the research repository for Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition.

Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition This is the research repository for Vid2

Future Interfaces Group (CMU) 26 Dec 24, 2022
LEAP: Learning Articulated Occupancy of People

LEAP: Learning Articulated Occupancy of People Paper | Video | Project Page This is the official implementation of the CVPR 2021 submission LEAP: Lear

Neural Bodies 60 Nov 18, 2022
Single Red Blood Cell Hydrodynamic Traps Via the Generative Design

Rbc-traps-generative-design - The generative design for single red clood cell hydrodynamic traps using GEFEST framework

Natural Systems Simulation Lab 4 Jun 16, 2022
[ICCV'2021] "SSH: A Self-Supervised Framework for Image Harmonization", Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

SSH: A Self-Supervised Framework for Image Harmonization (ICCV 2021) code for SSH Representative Examples Main Pipeline RealHM DataSet Google Drive Pr

VITA 86 Dec 02, 2022
This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search"

InvariantAncestrySearch This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search

Phillip Bredahl Mogensen 0 Feb 02, 2022
Chinese clinical named entity recognition using pre-trained BERT model

Chinese clinical named entity recognition (CNER) using pre-trained BERT model Introduction Code for paper Chinese clinical named entity recognition wi

Xiangyang Li 109 Dec 14, 2022