(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework

Overview

(Py)TOD: Tensor-based Outlier Detection, A General GPU-Accelerated Framework


Background: Outlier detection (OD) is a key data mining task for identifying abnormal objects from general samples with numerous high-stake applications including fraud detection and intrusion detection.

To scale outlier detection (OD) to large-scale, high-dimensional datasets, we propose TOD, a novel system that abstracts OD algorithms into basic tensor operations for efficient GPU acceleration.

The corresponding paper. The code is being cleaned up and released. Please watch and star!

One reason to use it:

On average, TOD is 11 times faster than PyOD!

If you need another reason: it can handle much larger datasets:more than a million sample OD within an hour!


TOD is featured for:

  • Unified APIs, detailed documentation, and examples for the easy use (under construction)
  • Supports more than 10 different OD algorithms and more are being added
  • TOD supports multi-GPU acceleration
  • Advanced techniques like provable quantization

Programming Model Interface

Complex OD algorithms can be abstracted into common tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction.png

For instance, ABOD and COPOD can be assembled by the basic tensor operators.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/abstraction_example.png

End-to-end Performance Comparison with PyOD

Overall, it is much (on avg. 11 times) faster than PyOD takes way less run time.

https://raw.githubusercontent.com/yzhao062/pytod/master/figs/run_time.png

Code is being released. Watch and star for the latest news!

Comments
  • Error while installing package

    Error while installing package

    I installed Pytorch 1.10 from their site. It seen in virtual environment. I try pip install pytod but when searching for pytorch, it cannot find it because it searches with the "pytorch" package, not the "torch" package.

    ERROR: Could not find a version that satisfies the requirement pytorch>=1.7 (from pytod) (from versions: 0.1.2, 1.0.2)
    ERROR: No matching distribution found for pytorch>=1.7
    
    opened by nuriakiin 1
  • decision_function() returns None

    decision_function() returns None

    Thanks for the package. When I try to implement LOF (or KNN) decision_function() on test data returns empty object. Is there a fix to this? Following is the code that replicates the issue (on GPU):

    from pytod.models.lof import LOF import torch import numpy as np

    x = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [75,80]], dtype=np.float32) x = torch.from_numpy(x)

    y = np.array([[6, 5], [1, 2], [3, 4], [5, 1], [11,12]], dtype=np.float32) y = torch.from_numpy(y)

    lof = LOF(n_neighbors=2, device = 'cuda:0')

    lof.fit(x)

    print(lof.decision_function(y))

    opened by sugatc 0
  • Support for novelty detection and changing distance metric with local outlier factor

    Support for novelty detection and changing distance metric with local outlier factor

    The current implementation of LOF doesn't allow changing the distance metric to 'cosine', for example or setting novelty = True which prevents it from being used for novelty detection task. It will be great if support can be added for these.

    opened by sugatc 2
  • can't fit model in colab

    can't fit model in colab

    when i try fit on any model in colab gpu instance i get the following error. my dataset has 2 columns and 1 million rows:


    AttributeError Traceback (most recent call last) in () 4 clf_name = 'KNN' 5 clf = LOF() ----> 6 clf.fit(X)

    3 frames /usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in getattr(self, name) 5485 ): 5486 return self[name] -> 5487 return object.getattribute(self, name) 5488 5489 def setattr(self, name: str, value) -> None:

    AttributeError: 'DataFrame' object has no attribute 'to'

    opened by yairVanti 0
  • clean up reproducibility scripts

    clean up reproducibility scripts

    We are cleaning up these scripts for an easy run, while the primary results are reproducible with the compare_real_data.py (https://github.com/yzhao062/pytod/tree/main/reproducibility)

    enhancement 
    opened by yzhao062 0
Releases(v0.0.2)
  • v0.0.2(Jun 19, 2022)

    v<0.0.1>, <04/12/2021> -- Add LOF. v<0.0.1>, <04/23/2021> -- Add ABOD. v<0.0.2>, <06/19/2021> -- Add PCA and HBOS. v<0.0.2>, <06/19/2021> -- Turn on test suites.

    Now we have updated both the paper the repo to cover more algorithms.

    Source code(tar.gz)
    Source code(zip)
Owner
Yue Zhao
Ph.D. Student @ CMU. Outlier Detection Systems | ML Systems (MLSys) | Anomaly/Outlier Detection | AutoML. Twitter@ yzhao062
Yue Zhao
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
We present a framework for training multi-modal deep learning models on unlabelled video data by forcing the network to learn invariances to transformations applied to both the audio and video streams.

Multi-Modal Self-Supervision using GDT and StiCa This is an official pytorch implementation of papers: Multi-modal Self-Supervision from Generalized D

Facebook Research 42 Dec 09, 2022
Modular Probabilistic Programming on MXNet

MXFusion | | | | Tutorials | Documentation | Contribution Guide MXFusion is a modular deep probabilistic programming library. With MXFusion Modules yo

Amazon 100 Dec 10, 2022
Virtual Dance Reality Stage: a feature that offers you to share a stage with another user virtually

Portrait Segmentation using Tensorflow This script removes the background from an input image. You can read more about segmentation here Setup The scr

291 Dec 24, 2022
A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding his way.

GuidEye A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding h

Munal Jain 0 Aug 09, 2022
A Python package for time series augmentation

tsaug tsaug is a Python package for time series augmentation. It offers a set of augmentation methods for time series, as well as a simple API to conn

Arundo Analytics 278 Jan 01, 2023
SoGCN: Second-Order Graph Convolutional Networks

SoGCN: Second-Order Graph Convolutional Networks This is the authors' implementation of paper "SoGCN: Second-Order Graph Convolutional Networks" in Py

Yuehao 7 Aug 16, 2022
The source code for Adaptive Kernel Graph Neural Network at AAAI2022

AKGNN The source code for Adaptive Kernel Graph Neural Network at AAAI2022. Please cite our paper if you think our work is helpful to you: @inproceedi

11 Nov 25, 2022
Code for the paper "How Attentive are Graph Attention Networks?"

How Attentive are Graph Attention Networks? This repository is the official implementation of How Attentive are Graph Attention Networks?. The PyTorch

175 Dec 29, 2022
A Pytorch implementation of MoveNet from Google. Include training code and pre-train model.

Movenet.Pytorch Intro MoveNet is an ultra fast and accurate model that detects 17 keypoints of a body. This is A Pytorch implementation of MoveNet fro

Mr.Fire 241 Dec 26, 2022
This computer program provides a reference implementation of Lagrangian Monte Carlo in metric induced by the Monge patch

This computer program provides a reference implementation of Lagrangian Monte Carlo in metric induced by the Monge patch. The code was prepared to the final version of the accepted manuscript in AIST

Marcelo Hartmann 2 May 06, 2022
The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021) Project Page | Paper Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai GOF

xuxudong 97 Nov 10, 2022
XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

XtremeDistilTransformers for Distilling Massive Multilingual Neural Networks ACL 2020 Microsoft Research [Paper] [Video] Releasing [XtremeDistilTransf

Microsoft 125 Jan 04, 2023
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi

1 Jul 18, 2022
Model parallel transformers in Jax and Haiku

Mesh Transformer Jax A haiku library using the new(ly documented) xmap operator in Jax for model parallelism of transformers. See enwik8_example.py fo

Ben Wang 4.8k Jan 01, 2023
Some bravo or inspiring research works on the topic of curriculum learning.

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN Official code for NeurIPS 2021 paper "Towards Scalable Unpaired Virtu

131 Jan 07, 2023
Spherical Confidence Learning for Face Recognition, accepted to CVPR2021.

Sphere Confidence Face (SCF) This repository contains the PyTorch implementation of Sphere Confidence Face (SCF) proposed in the CVPR2021 paper: Shen

Maths 70 Dec 09, 2022
This repository contains a pytorch implementation of "StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision".

StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision | Project Page | Paper | This repository contains a pytorch implementation of "St

87 Dec 09, 2022
Captcha-tensorflow - Image Captcha Solving Using TensorFlow and CNN Model. Accuracy 90%+

Captcha Solving Using TensorFlow Introduction Solve captcha using TensorFlow. Learn CNN and TensorFlow by a practical project. Follow the steps, run t

Jackon Yang 869 Jan 06, 2023