(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework

Last update: Jan 05, 2023

Overview


Background: Outlier detection (OD) is a key data mining task for identifying abnormal objects among general samples, with numerous high-stakes applications including fraud detection and intrusion detection.

To scale OD to large, high-dimensional datasets, we propose TOD, a novel system that abstracts OD algorithms into basic tensor operations for efficient GPU acceleration.

See the corresponding paper. The code is being cleaned up and released. Please watch and star!

One reason to use it:

On average, TOD is 11 times faster than PyOD!

If you need another reason: it can handle much larger datasets, running OD on more than a million samples within an hour!

TOD features:

Unified APIs, detailed documentation, and examples for ease of use (under construction)
Support for more than 10 different OD algorithms, with more being added
Multi-GPU acceleration
Advanced techniques such as provable quantization

Programming Model Interface

Complex OD algorithms can be abstracted into common tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction.png

For instance, ABOD and COPOD can be assembled from the basic tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction_example.png
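
As a rough illustration of the operator idea (not TOD's actual implementation, and using a simpler kNN-style detector rather than ABOD or COPOD), the sketch below assembles an outlier score from two standard PyTorch operators, torch.cdist and torch.topk. The function name knn_outlier_scores and the sample data are hypothetical.

```python
import torch


def knn_outlier_scores(X: torch.Tensor, k: int = 2) -> torch.Tensor:
    """Score each row by the distance to its k-th nearest other row."""
    # Operator 1: dense pairwise Euclidean distances, shape (n, n).
    dist = torch.cdist(X, X)
    # Operator 2: the k + 1 smallest distances per row, sorted ascending.
    # The extra slot absorbs each point's (near-)zero distance to itself,
    # assuming that self-distance is the smallest entry in its row.
    smallest, _ = torch.topk(dist, k + 1, dim=1, largest=False)
    # A larger k-th neighbor distance means a more isolated point.
    return smallest[:, -1]


# Run on the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
X = torch.tensor(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [75.0, 80.0]],
    device=device,
)
scores = knn_outlier_scores(X, k=2)
print(scores.argmax().item())  # index of the most isolated row
```

Because both steps are ordinary tensor operators, the same function runs unchanged on either device; only the placement of X changes.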

End-to-end Performance Comparison with PyOD

Overall, TOD takes far less run time than PyOD: on average, it is 11 times faster.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/run_time.png

Code is being released. Watch and star for the latest news!

Comments

Error while installing package
I installed PyTorch 1.10 from their site, and it is visible in my virtual environment. I tried pip install pytod, but when it looks for PyTorch it cannot find it, because it searches for the "pytorch" package, not the "torch" package.

ERROR: Could not find a version that satisfies the requirement pytorch>=1.7 (from pytod) (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch>=1.7
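
The error above suggests that pip resolves the requirement by distribution name, and PyTorch is published on PyPI as "torch". A minimal sketch for checking which name is actually installed in the active environment, assuming Python 3.8 or newer (the helper name installed_version is hypothetical):

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# PyTorch is expected under "torch"; "pytorch" is a different distribution name.
for name in ("torch", "pytorch"):
    print(name, installed_version(name))
```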
opened by nuriakiin 1
decision_function() returns None

Thanks for the package. When I use LOF (or KNN), decision_function() on test data returns an empty object. Is there a fix for this? The following code reproduces the issue (on GPU):

from pytod.models.lof import LOF
import torch
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [75, 80]], dtype=np.float32)
x = torch.from_numpy(x)

y = np.array([[6, 5], [1, 2], [3, 4], [5, 1], [11, 12]], dtype=np.float32)
y = torch.from_numpy(y)

lof = LOF(n_neighbors=2, device = 'cuda:0')

lof.fit(x)

print(lof.decision_function(y))

opened by sugatc 0
Support for novelty detection and changing distance metric with local outlier factor

The current implementation of LOF doesn't allow changing the distance metric (to 'cosine', for example) or setting novelty=True, which prevents it from being used for novelty detection tasks. It would be great if support for these could be added.
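
To make the request concrete, here is a minimal sketch of a cosine distance matrix built from plain PyTorch operators. It is not part of PyTOD's API; the function name pairwise_cosine_distance is hypothetical, and rows are assumed to be nonzero vectors.

```python
import torch
import torch.nn.functional as F


def pairwise_cosine_distance(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """Return the (len(A), len(B)) matrix of 1 - cosine similarity."""
    # Scale every row to unit L2 norm so a dot product equals the cosine.
    A_unit = F.normalize(A, p=2, dim=1)
    B_unit = F.normalize(B, p=2, dim=1)
    # One matrix multiplication yields all pairwise cosine similarities.
    return 1.0 - A_unit @ B_unit.T


A = torch.tensor([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
D = pairwise_cosine_distance(A, A)
print(D)
```

A matrix like this could, in principle, stand in for the Euclidean distance matrix in a neighbor search; whether LOF in PyTOD can accept it is exactly what the issue asks about.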

opened by sugatc 2
can't fit model in colab

When I try to fit any model on a Colab GPU instance, I get the following error. My dataset has 2 columns and 1 million rows:

AttributeError Traceback (most recent call last)
in ()
4 clf_name = 'KNN'
5 clf = LOF()
----> 6 clf.fit(X)

3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in __getattr__(self, name)
5485 ):
5486 return self[name]
-> 5487 return object.__getattribute__(self, name)
5488
5489 def __setattr__(self, name: str, value) -> None:

AttributeError: 'DataFrame' object has no attribute 'to'
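
The traceback suggests that fit() calls .to(...) on its input, which is a method of torch tensors and not of pandas DataFrames. One possible workaround, sketched under that assumption, is to convert the data to a float32 tensor before fitting. The helper name as_float_tensor is hypothetical and not part of PyTOD.

```python
import numpy as np
import torch


def as_float_tensor(data) -> torch.Tensor:
    """Convert array-like input to a float32 torch tensor."""
    # np.array copies the input and accepts nested lists, NumPy arrays,
    # and numeric pandas DataFrames alike.
    array = np.array(data, dtype=np.float32)
    return torch.from_numpy(array)


# A nested list stands in for the two-column dataset from the report.
X = as_float_tensor([[1, 2], [3, 4], [75, 80]])
print(X.dtype, tuple(X.shape))
```

The resulting tensor is what would be passed to clf.fit(X) in place of the DataFrame.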

opened by yairVanti 0
clean up reproducibility scripts

We are cleaning up these scripts so they are easy to run. In the meantime, the primary results are reproducible with compare_real_data.py (https://github.com/yzhao062/pytod/tree/main/reproducibility)
enhancement

opened by yzhao062 0

Releases(v0.0.2)

v0.0.2(Jun 19, 2022)

v<0.0.1>, <04/12/2021> -- Add LOF.
v<0.0.1>, <04/23/2021> -- Add ABOD.
v<0.0.2>, <06/19/2021> -- Add PCA and HBOS.
v<0.0.2>, <06/19/2021> -- Turn on test suites.

We have now updated both the paper and the repo to cover more algorithms.

Owner

Yue Zhao

Ph.D. Student @ CMU. Outlier Detection Systems | ML Systems (MLSys) | Anomaly/Outlier Detection | AutoML. Twitter@ yzhao062

GitHub Repository https://www.andrew.cmu.edu/user/yuezhao2/papers/21-preprint-tod.pdf

