Industrial knn-based anomaly detection for images. Visit streamlit link to check out the demo.

Overview

Industrial KNN-based Anomaly Detection

⭐ Now has streamlit support! ⭐ Run $ streamlit run streamlit_app.py

This repo aims to reproduce the results of the following KNN-based anomaly detection methods:

  1. SPADE (Cohen et al. 2021) - knn in z-space and distance to feature maps spade schematic
  2. PaDiM* (Defard et al. 2020) - distance to multivariate Gaussian of feature maps padim schematic
  3. PatchCore (Roth et al. 2021) - knn distance to avgpooled feature maps patchcore schematic

* actually does not have any knn mechanism, but shares many things implementation-wise.


Install

$ pipenv install -r requirements.txt

Note: I used torch cu11 wheels.

Usage

CLI:

$ python indad/run.py METHOD [--dataset DATASET]

Results can be found under ./results/.

Code example:

from indad.model import SPADE

model = SPADE(k=5, backbone_name="resnet18")

# feed healthy dataset
model.fit(...)

# get predictions
img_lvl_anom_score, pxl_lvl_anom_score = model.predict(...)

Custom datasets

πŸ‘οΈ

Check out one of the downloaded MVTec datasets. Naming of images should correspond among folders. Right now there is no support for no ground truth pixel masks.

πŸ“‚datasets
 β”— πŸ“‚your_custom_dataset
  ┣ πŸ“‚ ground_truth/defective
  ┃ ┣ πŸ“‚ defect_type_1
  ┃ β”— πŸ“‚ defect_type_2
  ┣ πŸ“‚ test
  ┃ ┣ πŸ“‚ defect_type_1
  ┃ ┣ πŸ“‚ defect_type_2
  ┃ β”— πŸ“‚ good
  β”— πŸ“‚ train/good
$ python indad/run.py METHOD --dataset your_custom_dataset

Results

πŸ“ = paper, πŸ‘‡ = this repo

Image-level

class SPADE πŸ“ SPADE πŸ‘‡ PaDiM πŸ“ PaDiM πŸ‘‡ PatchCore πŸ“ PatchCore πŸ‘‡
bottle - 98.3 98.3 99.9 100.0 100.0
cable - 88.1 96.7 87.8 99.5 96.2
capsule - 80.4 98.5 87.6 98.1 95.3
carpet - 62.5 99.1 99.5 98.7 98.7
grid - 25.6 97.3 95.5 98.2 93.0
hazelnut - 92.8 98.2 86.1 100.0 100.0
leather - 85.6 99.2 100.0 100.0 100.0
metal_nut - 78.6 97.2 97.6 100.0 98.3
pill - 78.8 95.7 92.7 96.6 92.8
screw - 66.1 98.5 79.6 98.1 96.7
tile - 96.4 94.1 99.5 98.7 99.0
toothbrush - 83.9 98.8 94.7 100.0 98.1
transistor - 89.4 97.5 95.0 100.0 99.7
wood - 85.3 94.7 99.4 99.2 98.8
zipper - 97.1 98.5 93.8 99.4 98.4
averages 85.5 80.6 97.5 93.9 99.1 97.7

Pixel-level

class SPADE πŸ“ SPADE πŸ‘‡ PaDiM πŸ“ PaDiM πŸ‘‡ PatchCore πŸ“ PatchCore πŸ‘‡
bottle 97.5 97.7 94.8 97.6 98.6 97.8
cable 93.7 94.4 88.8 95.5 98.5 97.4
capsule 97.6 98.7 93.5 98.1 98.9 98.3
carpet 87.4 99.0 96.2 98.7 99.1 98.3
grid 88.5 96.4 94.6 96.4 98.7 96.7
hazelnut 98.4 98.4 92.6 97.3 98.7 98.1
leather 97.2 99.1 97.8 98.6 99.3 98.4
metal_nut 99.0 96.1 85.6 95.8 98.4 96.2
pill 99.1 93.5 92.7 94.4 97.6 98.7
screw 98.1 98.9 94.4 97.5 99.4 98.4
tile 96.5 93.1 86.0 92.6 95.9 94.0
toothbrush 98.9 98.9 93.1 98.5 98.7 98.1
transistor 97.9 95.8 84.5 96.9 96.4 97.5
wood 94.1 94.5 91.1 92.9 95.1 91.9
zipper 96.5 98.3 95.9 97.0 98.9 97.6
averages 96.9 96.6 92.1 96.5 98.1 97.2

PatchCore-10 was used.

Hyperparams

The following parameters were used to calculate the results. They more or less correspond to the parameters used in the papers.

spade:
  backbone: wide_resnet50_2
  k: 50
padim:
  backbone: wide_resnet50_2
  d_reduced: 250
  epsilon: 0.04
patchcore:
  backbone: wide_resnet50_2
  f_coreset: 0.1
  n_reweight: 3

Progress

  • Datasets
  • Code skeleton
  • Config files
  • CLI
  • Logging
  • SPADE
  • PADIM
  • PatchCore
  • Add custom dataset option
  • Add dataset progress bar
  • Add schematics
  • Unit tests

Design considerations

  • Data is processed in single images to avoid batch statistics interference.
  • I decided to implement greedy kcenter from scratch and there is room for improvement.
  • torch.nn.AdaptiveAvgPool2d for feature map resizing, torch.nn.functional.interpolate for score map resizing.
  • GPU is used for backbones and coreset selection. GPU coreset selection currently runs at:
    • 400-500 it/s @ float32 (RTX3080)
    • 1000+ it/s @ float16 (RTX3080)

Acknowledgements

  • hcw-00 for tipping sklearn.random_projection.SparseRandomProjection

References

SPADE:

@misc{cohen2021subimage,
      title={Sub-Image Anomaly Detection with Deep Pyramid Correspondences}, 
      author={Niv Cohen and Yedid Hoshen},
      year={2021},
      eprint={2005.02357},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PaDiM:

@misc{defard2020padim,
      title={PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization}, 
      author={Thomas Defard and Aleksandr Setkov and Angelique Loesch and Romaric Audigier},
      year={2020},
      eprint={2011.08785},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PatchCore:

@misc{roth2021total,
      title={Towards Total Recall in Industrial Anomaly Detection}, 
      author={Karsten Roth and Latha Pemula and Joaquin Zepeda and Bernhard SchΓΆlkopf and Thomas Brox and Peter Gehler},
      year={2021},
      eprint={2106.08265},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
aventau
Into graphics and modelling. Computer Vision / Machine Learning Engineer.
aventau
Generalized hybrid model for mode-locked laser diodes with an extended passive cavity

GenHybridMLLmodel Generalized hybrid model for mode-locked laser diodes with an extended passive cavity This hybrid simulation strategy combines a tra

Stijn Cuyvers 3 Sep 21, 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition

AdaFocusV2 This repo contains the official code and pre-trained models for AdaFo

79 Dec 26, 2022
Explainability for Vision Transformers (in PyTorch)

Explainability for Vision Transformers (in PyTorch) This repository implements methods for explainability in Vision Transformers

Jacob Gildenblat 442 Jan 04, 2023
City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces

City Surfaces: City-scale Semantic Segmentation of Sidewalk Surfaces Paper Temporary GitHub page for City Surfaces paper. More soon! While designing s

14 Nov 10, 2022
Music library streaming app written in Flask & VueJS

djtaytay This is a little toy app made to explore Vue, brush up on my Python, and make a remote music collection accessable through a web interface. I

Ryan Tasson 6 May 27, 2022
Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators..

ARAPReg Code for ICCV 2021 paper: ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators.. Installation The cod

Bo Sun 132 Nov 28, 2022
DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene.

DirectVoxGO reconstructs a scene representation from a set of calibrated images capturing the scene. We achieve NeRF-comparable novel-view synthesis quality with super-fast convergence.

sunset 709 Dec 31, 2022
TFOD-MASKRCNN - Tensorflow MaskRCNN With Python

Tensorflow- MaskRCNN Steps git clone https://github.com/amalaj7/TFOD-MASKRCNN.gi

Amal Ajay 2 Jan 18, 2022
Scaling and Benchmarking Self-Supervised Visual Representation Learning

FAIR Self-Supervision Benchmark is deprecated. Please see VISSL, a ground-up rewrite of benchmark in PyTorch. FAIR Self-Supervision Benchmark This cod

Meta Research 584 Dec 31, 2022
CowHerd is a partially-observed reinforcement learning environment

CowHerd is a partially-observed reinforcement learning environment, where the player walks around an area and is rewarded for milking cows. The cows try to escape and the player can place fences to h

Danijar Hafner 6 Mar 06, 2022
Code for Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid

SPN: Fully Context-Aware Image Inpainting with a Learned Semantic Pyramid Code for Fully Context-Aware Image Inpainting with a Learned Semantic Pyrami

12 Jun 27, 2022
Use unsupervised and supervised learning to predict stocks

AIAlpha: Multilayer neural network architecture for stock return prediction This project is meant to be an advanced implementation of stacked neural n

Vivek Palaniappan 1.5k Dec 26, 2022
An LSTM based GAN for Human motion synthesis

GAN-motion-Prediction An LSTM based GAN for motion synthesis has a few issues reading H3.6M data from A.Jain et al , will fix soon. Prediction of the

Amogh Adishesha 9 Jun 17, 2022
A High-Quality Real Time Upscaler for Anime Video

Anime4K Anime4K is a set of open-source, high-quality real-time anime upscaling/denoising algorithms that can be implemented in any programming langua

15.7k Jan 06, 2023
Code and training data for our ECCV 2016 paper on Unsupervised Learning

Shuffle and Learn (Shuffle Tuple) Created by Ishan Misra Based on the ECCV 2016 Paper - "Shuffle and Learn: Unsupervised Learning using Temporal Order

Ishan Misra 44 Dec 08, 2021
UV matrix decompostion using movielens dataset

UV-matrix-decompostion-with-kfold UV matrix decompostion using movielens dataset upload the 'ratings.dat' file install the following python libraries

2 Oct 18, 2022
CTF challenges from redpwnCTF 2021

redpwnCTF 2021 Challenges This repository contains challenges from redpwnCTF 2021 in the rCDS format; challenge information is in the challenge.yaml f

redpwn 27 Dec 07, 2022
Medical Image Segmentation using Squeeze-and-Expansion Transformers

Medical Image Segmentation using Squeeze-and-Expansion Transformers Introduction This repository contains the code of the IJCAI'2021 paper 'Medical Im

askerlee 172 Dec 20, 2022
Automatic library of congress classification, using word embeddings from book titles and synopses.

Automatic Library of Congress Classification The Library of Congress Classification (LCC) is a comprehensive classification system that was first deve

Ahmad Pourihosseini 3 Oct 01, 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

WangRui 46 Dec 29, 2022