Algorithms for outlier, adversarial and drift detection

Overview

Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline detectors for tabular data, text, images and time series. Both TensorFlow and PyTorch backends are supported for drift detection.

For more background on the importance of monitoring outliers and distributions in a production setting, check out this talk from the Challenges in Deploying and Monitoring Machine Learning Systems ICML 2020 workshop, based on the paper Monitoring and explainability of models in production and referencing Alibi Detect.

For a thorough introduction to drift detection, check out Protecting Your Machine Learning Against Drift: An Introduction. The talk covers what drift is and why it pays to detect it, the different types of drift, how it can be detected in a principled manner and also describes the anatomy of a drift detector.

Installation and Usage

alibi-detect can be installed from PyPI:

pip install alibi-detect

Alternatively, the development version can be installed:

pip install git+https://github.com/SeldonIO/alibi-detect.git

To use the Prophet time series outlier detector:

pip install alibi-detect[prophet]

We will use the VAE outlier detector to illustrate the API.

from alibi_detect.od import OutlierVAE
from alibi_detect.utils import save_detector, load_detector

# initialize and fit detector
od = OutlierVAE(threshold=0.1, encoder_net=encoder_net, decoder_net=decoder_net, latent_dim=1024)
od.fit(x_train)

# make predictions
preds = od.predict(x_test)

# save and load detectors
filepath = './my_detector/'
save_detector(od, filepath)
od = load_detector(filepath)

The predictions are returned in a dictionary with keys meta and data. meta contains the detector's metadata, while data is itself a dictionary holding the actual predictions: the outlier, adversarial or drift scores and thresholds, as well as the boolean predictions of whether instances are e.g. outliers or not. The exact details can vary slightly from method to method, so we encourage the reader to become familiar with the types of algorithms supported.
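
For instance, for the VAE outlier detector above, the dictionary can be inspected as follows (the key names shown are those used by OutlierVAE; other detectors may name them slightly differently):

# sketch: inspecting the prediction dictionary returned by od.predict(x_test)
print(preds['meta'])                        # detector metadata, e.g. name and detector type
print(preds['data']['is_outlier'][:5])      # 0/1 outlier flags per instance
print(preds['data']['instance_score'][:5])  # instance-level outlier scores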

The save and load functionality for the Prophet time series outlier detector is currently experiencing issues in Python 3.6 but works in Python 3.7.

Supported Algorithms

The following tables show the recommended use cases for each algorithm. The column Feature Level indicates whether detection can be performed at the feature level, e.g. per pixel for an image. Check the algorithm reference list for more information, with links to the documentation and original papers as well as examples for each of the detectors.

Outlier Detection

| Detector | Tabular | Image | Time Series | Text | Categorical Features | Online | Feature Level |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Isolation Forest | ✔ |  |  |  | ✔ |  |  |
| Mahalanobis Distance | ✔ |  |  |  | ✔ | ✔ |  |
| AE | ✔ | ✔ |  |  |  |  | ✔ |
| VAE | ✔ | ✔ |  |  |  |  | ✔ |
| AEGMM | ✔ | ✔ |  |  |  |  |  |
| VAEGMM | ✔ | ✔ |  |  |  |  |  |
| Likelihood Ratios | ✔ | ✔ | ✔ |  | ✔ |  | ✔ |
| Prophet |  |  | ✔ |  |  |  |  |
| Spectral Residual |  |  | ✔ |  |  | ✔ | ✔ |
| Seq2Seq |  |  | ✔ |  |  |  | ✔ |

Adversarial Detection

| Detector | Tabular | Image | Time Series | Text | Categorical Features | Online | Feature Level |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Adversarial AE | ✔ | ✔ |  |  |  |  |  |
| Model distillation | ✔ | ✔ | ✔ | ✔ | ✔ |  |  |

Drift Detection

| Detector | Tabular | Image | Time Series | Text | Categorical Features | Online | Feature Level |
| --- | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Kolmogorov-Smirnov | ✔ | ✔ |  | ✔ |  |  | ✔ |
| Cramér-von Mises | ✔ | ✔ |  |  |  | ✔ | ✔ |
| Fisher's Exact Test | ✔ |  |  |  | ✔ | ✔ | ✔ |
| Maximum Mean Discrepancy | ✔ | ✔ |  | ✔ |  | ✔ |  |
| Learned Kernel MMD | ✔ | ✔ |  | ✔ |  |  |  |
| Least-Squares Density Difference | ✔ | ✔ |  | ✔ |  | ✔ |  |
| Chi-Squared | ✔ |  |  |  | ✔ |  | ✔ |
| Mixed-type tabular data | ✔ |  |  |  | ✔ |  | ✔ |
| Classifier | ✔ | ✔ | ✔ | ✔ | ✔ |  |  |
| Spot-the-diff | ✔ | ✔ | ✔ | ✔ | ✔ |  | ✔ |
| Classifier Uncertainty | ✔ | ✔ | ✔ | ✔ | ✔ |  |  |
| Regressor Uncertainty | ✔ | ✔ | ✔ | ✔ | ✔ |  |  |

TensorFlow and PyTorch support

The drift detectors support TensorFlow and PyTorch backends. Note, however, that Alibi Detect does not install PyTorch for you; check the PyTorch docs for how to do this. Example:

from alibi_detect.cd import MMDDrift

cd = MMDDrift(x_ref, backend='tensorflow', p_val=.05)
preds = cd.predict(x)

The same detector in PyTorch:

cd = MMDDrift(x_ref, backend='pytorch', p_val=.05)
preds = cd.predict(x)

Built-in preprocessing steps

Alibi Detect also comes with various preprocessing steps, such as randomly initialized encoders, pretrained text embeddings (via the transformers library) to detect drift on, and extraction of hidden layers from machine learning models. This makes it possible to detect different types of drift, such as covariate and predicted distribution shift. The preprocessing steps are again supported in TensorFlow and PyTorch.

from functools import partial

from alibi_detect.cd import MMDDrift
from alibi_detect.cd.tensorflow import HiddenOutput, preprocess_drift

model = ...  # TensorFlow model; tf.keras.Model or tf.keras.Sequential
preprocess_fn = partial(preprocess_drift, model=HiddenOutput(model, layer=-1), batch_size=128)
cd = MMDDrift(x_ref, backend='tensorflow', p_val=.05, preprocess_fn=preprocess_fn)
preds = cd.predict(x)

Check the example notebooks (e.g. CIFAR10, movie reviews) for more details.

Reference List

Outlier Detection

Adversarial Detection

Drift Detection

Datasets

The package also contains functionality in alibi_detect.datasets to easily fetch a number of datasets for different modalities. For each dataset, either the data and labels or a Bunch object with the data, labels and optional metadata is returned. Example:

from alibi_detect.datasets import fetch_ecg

(X_train, y_train), (X_test, y_test) = fetch_ecg(return_X_y=True)

Sequential Data and Time Series

  • Genome Dataset: fetch_genome

    • Bacteria genomics dataset for out-of-distribution detection, released as part of Likelihood Ratios for Out-of-Distribution Detection. From the original TL;DR: The dataset contains genomic sequences of 250 base pairs from 10 in-distribution bacteria classes for training, 60 OOD bacteria classes for validation, and another 60 different OOD bacteria classes for test. There are respectively 1, 7 and again 7 million sequences in the training, validation and test sets. For detailed info on the dataset check the README.
    from alibi_detect.datasets import fetch_genome
    
    (X_train, y_train), (X_val, y_val), (X_test, y_test) = fetch_genome(return_X_y=True)
  • ECG 5000: fetch_ecg

    • 5000 ECGs, originally obtained from Physionet.
  • NAB: fetch_nab

    • Any univariate time series in a DataFrame from the Numenta Anomaly Benchmark. A list with the available time series can be retrieved using alibi_detect.datasets.get_list_nab().
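    Example (a sketch; the exact signature and return type of fetch_nab may differ slightly):
    from alibi_detect.datasets import fetch_nab, get_list_nab
    
    ts_names = get_list_nab()                        # names of the available NAB time series
    X, y = fetch_nab(ts_names[0], return_X_y=True)   # assumed: series values and outlier labels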

Images

  • CIFAR-10-C: fetch_cifar10c

    • CIFAR-10-C (Hendrycks & Dietterich, 2019) contains the test set of CIFAR-10, but corrupted and perturbed by various types of noise, blur, brightness etc. at different levels of severity, leading to a gradual decline in the performance of a classification model trained on CIFAR-10. fetch_cifar10c allows you to pick any severity level or corruption type. The list of available corruption types can be retrieved with alibi_detect.datasets.corruption_types_cifar10c(). The dataset can be used in research on robustness and drift. The original data can be found here. Example:
    from alibi_detect.datasets import fetch_cifar10c
    
    corruption = ['gaussian_noise', 'motion_blur', 'brightness', 'pixelate']
    X, y = fetch_cifar10c(corruption=corruption, severity=5, return_X_y=True)
  • Adversarial CIFAR-10: fetch_attack

    • Load adversarial instances on a ResNet-56 classifier trained on CIFAR-10. Available attacks: Carlini-Wagner ('cw') and SLIDE ('slide'). Example:
    from alibi_detect.datasets import fetch_attack
    
    (X_train, y_train), (X_test, y_test) = fetch_attack('cifar10', 'resnet56', 'cw', return_X_y=True)

Tabular

  • KDD Cup '99: fetch_kdd
    • Dataset with different types of computer network intrusions. fetch_kdd allows you to select a subset of network intrusions as targets or pick only specified features. The original data can be found here.
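    Example (a sketch following the return_X_y convention of the other fetch functions):
    from alibi_detect.datasets import fetch_kdd
    
    X, y = fetch_kdd(return_X_y=True)  # assumed: network data and intrusion labels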

Models

Models and/or building blocks that can be useful outside of outlier, adversarial or drift detection can be found under alibi_detect.models. Main implementations:

  • PixelCNN++: alibi_detect.models.pixelcnn.PixelCNN

  • Variational Autoencoder: alibi_detect.models.autoencoder.VAE

  • Sequence-to-sequence model: alibi_detect.models.autoencoder.Seq2Seq

  • ResNet: alibi_detect.models.resnet

    • Pre-trained ResNet-20/32/44 models on CIFAR-10 can be found on our Google Cloud Bucket and can be fetched as follows:
    from alibi_detect.utils.fetching import fetch_tf_model
    
    model = fetch_tf_model('cifar10', 'resnet32')

Integrations

Alibi Detect is integrated into the open source machine learning model deployment platform Seldon Core and the model serving framework KFServing.

Citations

If you use alibi-detect in your research, please consider citing it.

BibTeX entry:

@software{alibi-detect,
  title = {Alibi Detect: Algorithms for outlier, adversarial and drift detection},
  author = {Van Looveren, Arnaud and Klaise, Janis and Vacanti, Giovanni and Cobb, Oliver and Scillitoe, Ashley and Samoilescu, Robert},
  url = {https://github.com/SeldonIO/alibi-detect},
  version = {0.8.0},
  date = {2021-12-09},
  year = {2019}
}
Comments
  • Config driven detectors - part 3

    Config driven detectors - part 3

    This is the third part of a series of PRs for the config-driven detector functionality. The original PR (https://github.com/SeldonIO/alibi-detect/pull/389) has been split into a number of smaller PRs to aid the review process.

    Summary of PR

    This PR implements the main save and load functionality, and related docs and testing. For more details on the overall config-based save/load strategy, refer to the original PR #389.

    Details

    The mainstay of this PR is contained in utils/saving.py, and the newly created utils/loading.py.

    Additional details:

    • The top-level public functions of interest are save_detector and load_detector, both of which have been reworked to write/read config.toml files (in the case of drift detectors). Other detectors are still saved to the legacy .dill format, and support is retained for all detectors to read these legacy files (to avoid having to regenerate all remote artifacts immediately).
    • Loading functionality has been moved from utils/saving.py to utils/loading.py, since the loading submodule is now larger and is expected to grow further. A deprecation warning is raised (but it still works) when from alibi_detect.utils.saving import load_detector is called.
    • The backend-specific bits of saving.py and loading.py have been factored out into tensorflow/_saving.py etc., in preparation for the soon-to-be-built PyTorch/sklearn save/load functionality. This also means the file-wide type: ignore's can be removed.
    • The legacy save/load code has been moved to tensorflow/_saving.py and tensorflow/_loading.py, since in reality this was all tensorflow-specific.

    Fine details will be given in code comments below.

    Outstanding decisions

    • [x] Currently the top-level backend config field is used to declare the backend for backend-specific detectors. But it is also used to set the expected backend for all preprocessing models and kernels etc. Is this OK (at least for now), or do we want specific flags in the model (or preprocess_fn) configs? Part of this decision might depend on whether we ever envisage one library being used for the backend whilst another is used for preprocessing. I can't see this being sensible for PyTorch/Tensorflow, but perhaps for sklearn? - To be addressed in a subsequent PR.

    • [x] At the moment the following functions are public:

      • save_detector/load_detector - to save/load detector
      • write_config/read_config - to write/read config.toml from/into config dict
      • validate_config - to validate config dict

      All other functions in saving/loading.py are private, and the tensorflow/_saving.py etc. are also private. A number of functions exist to save/load artefact configs, e.g. _load_model_config. Making these public could be useful for some users, e.g. so the model section of a config.toml could be loaded in isolation for debugging, or a runtime model could be written out to assist with config.toml generation. However, making these public will hinder future code changes, so I'm inclined to leave them private until the config functionality is more stable? (A sketch of the public surface follows this list.)

    • [x] The current options for model type (e.g. 'UAE', 'HiddenOutput', 'custom') have been carried over from the legacy save/load code. We could do with rethinking what is required here.

    • [x] Find an example where custom_objects is needed, and investigate how these objects are defined. It might be easier to remove support for this for now. - Removed for now, will be added back in a later PR.

    • [x] In resolve_cfg, we only attempt to resolve fields in the config.toml which are listed in FIELDS_TO_RESOLVE. This has the advantage of avoiding infinite recursion, and also allows us to easily tell downstream deps (e.g. MLServer etc) what fields could potentially point to artefacts that will need resolving (such a list has been requested before). However, it is messy, and complicates the resolution of more generic containers such as custom_objects. If we assume that only a validated config will be passed to resolve_cfg (so we are certain of its structure), I'm wondering if we should go back to more generic recursion here? - Left for now. Can change later if there is a good reason.
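
    A hypothetical round-trip with the public functions listed above (module paths and exact signatures are illustrative, not final):

    from alibi_detect.utils.saving import save_detector
    from alibi_detect.utils.loading import load_detector

    save_detector(cd, './detector/')    # writes ./detector/config.toml for drift detectors
    cd2 = load_detector('./detector/')  # rebuilds the detector from the config

    # config dicts could also be handled directly (assumed helpers from this PR):
    # cfg = read_config('./detector/config.toml')   # config.toml -> config dict
    # validate_config(cfg)                          # raises if the config dict is invalid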

    Post-PR TODOs (to be consolidated into issues)

    • Adding custom_obj back in. See above bullet, and https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r837554116.
    • How to handle warnings when load->save->load across different versions. See https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r847118355.
    • Specifying backend for preprocessing and models separately? See above, and https://github.com/SeldonIO/alibi-detect/pull/469#issuecomment-1082989425.
    • Saving/loading state, outlier and adversarial detectors, online detectors, and PyTorch support.
    • Improve our registry submodule. See https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r836576042.
    • Use enum for backend etc. See https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r837547884.
    • Improve the alibi_detect.utils.schemas api page, either with custom templates or using autodoc-pydantic.
    • Turn off undoc-members. This will require docstrings to be added to public objects that are currently missing docstrings.
    • Python dunder methods to test equality of specific detectors. See https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r837755055.
    • Investigate randomness with TensorFlow and PyTorch based detectors, so that we can properly test the loaded and original detector predictions. See https://github.com/SeldonIO/alibi-detect/pull/469#discussion_r838525825.
    • Run isort on entire code base, and consider adding a check to CI.
    opened by ascillitoe 17
  • RuntimeError: expected scalar type Long but found Int in [cd_spot_the_diff_mnist_wine.ipynb]

    RuntimeError: expected scalar type Long but found Int in [cd_spot_the_diff_mnist_wine.ipynb]

    I am getting the following error when trying to execute the code (cell [10] in the section "Interpretable Drift Detection on the Wine Quality Dataset"):

    RuntimeError                              Traceback (most recent call last)
    ~\AppData\Local\Temp/ipykernel_3564/2414863533.py in <module>
         10 )
         11
    ---> 12 preds_h0 = cd.predict(x_h0)
         13 preds_corr = cd.predict(x_corr)

    ~\AppData\Roaming\Python\Python39\site-packages\alibi_detect\cd\spot_the_diff.py in predict(self, x, return_p_val, return_distance, return_probs, return_model)
        173         data, and the trained model.
        174         """
    --> 175         return self._detector.predict(x, return_p_val, return_distance, return_probs, return_model)

    ~\AppData\Roaming\Python\Python39\site-packages\alibi_detect\cd\pytorch\spot_the_diff.py in predict(self, x, return_p_val, return_distance, return_probs, return_model)
        212         data, and the trained model.
        213         """
    --> 214         preds = self._detector.predict(x, return_p_val, return_distance, return_probs, return_model=True)
        215         preds['data']['diffs'] = preds['data']['model'].diffs.detach().cpu().numpy()  # type: ignore
        216         preds['data']['diff_coeffs'] = preds['data']['model'].coeffs.detach().cpu().numpy()  # type: ignore

    ~\AppData\Roaming\Python\Python39\site-packages\alibi_detect\cd\base.py in predict(self, x, return_p_val, return_distance, return_probs, return_model)
        241         """
        242         # compute drift scores
    --> 243         p_val, dist, probs_ref, probs_test = self.score(x)
        244         drift_pred = int(p_val < self.p_val)
        245

    ~\AppData\Roaming\Python\Python39\site-packages\alibi_detect\cd\pytorch\classifier.py in score(self, x)
        182             self.model = self.model.to(self.device)
        183             train_args = [self.model, self.loss_fn, dl_tr, self.device]
    --> 184             trainer(*train_args, **self.train_kwargs)  # type: ignore
        185             preds = self.predict_fn(x_te, self.model.eval())
        186             preds_oof_list.append(preds)

    ~\AppData\Roaming\Python\Python39\site-packages\alibi_detect\models\pytorch\trainer.py in trainer(model, loss_fn, dataloader, device, optimizer, learning_rate, preprocess_fn, epochs, reg_loss_fn, verbose)
         53             y_hat = model(x)
         54             optimizer.zero_grad()  # type: ignore
    ---> 55             loss = loss_fn(y_hat, y) + reg_loss_fn(model)
         56             loss.backward()
         57             optimizer.step()  # type: ignore

    ~\anaconda3\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
       1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
       1101                 or _global_forward_hooks or _global_forward_pre_hooks):
    -> 1102             return forward_call(*input, **kwargs)
       1103         # Do not call functions when jit is used
       1104         full_backward_hooks, non_full_backward_hooks = [], []

    ~\anaconda3\lib\site-packages\torch\nn\modules\loss.py in forward(self, input, target)
       1148
       1149     def forward(self, input: Tensor, target: Tensor) -> Tensor:
    -> 1150         return F.cross_entropy(input, target, weight=self.weight,
       1151                                ignore_index=self.ignore_index, reduction=self.reduction,
       1152                                label_smoothing=self.label_smoothing)

    ~\anaconda3\lib\site-packages\torch\nn\functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
       2844     if size_average is not None or reduce is not None:
       2845         reduction = _Reduction.legacy_get_string(size_average, reduce)
    -> 2846     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
       2847
       2848

    RuntimeError: expected scalar type Long but found Int

    I use Python 3.8.8 on Win10, on an AMD Ryzen with integrated AMD graphics.

    Priority: High Type: Bug 
    opened by tomaszek0 17
  • Drift detection for mixed datasets - KS and Chi Square

    Drift detection for mixed datasets - KS and Chi Square

    Hi all,

    Hope this isn't too out of place - but this was brought on by my own needs after using alibi-detect! Because my data sets have both discrete and continuous variables, the KS drift detector couldn't work on its own. So, I adapted the feature_score() function to allow a switch between KS and Chi-square.

    Variables are automatically detected to be either discrete or continuous. If the unique count of values is < 5% of all values (unique values / size of overall sample), then a column is considered discrete.
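
    A minimal sketch of this heuristic (illustrative only, not the actual PR code):

    import numpy as np

    def is_discrete(col: np.ndarray, threshold: float = 0.05) -> bool:
        # treat a column as discrete if its unique values make up <5% of the sample
        return np.unique(col).size / col.size < threshold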

    I've inherited from the KSDrift class, as I only really wanted to overwrite one function and then add another.

    Happy to make any changes to improve!

    Thanks,

    Rob

    opened by Clusks 16
  • Update numba requirement from !=0.54.0,<0.56.0,>=0.50.0 to >=0.50.0,!=0.54.0,<0.57.0

    Update numba requirement from !=0.54.0,<0.56.0,>=0.50.0 to >=0.50.0,!=0.54.0,<0.57.0

    Updates the requirements on numba to permit the latest version.

    Release notes

    Sourced from numba's releases.

    Version 0.56.0

    This release continues to add new features, bug fixes and stability improvements to Numba. Please note that this will be the last release that has support for Python 3.7 as the next release series (Numba 0.57) will support Python 3.11! Also note that, this will be the last release to support linux-32 packages produced by the Numba team.

    Commits
    • f75c45a Merge pull request #8279 from sklam/misc/rel0.56cherry
    • 90671af Fix incorrect merge
    • 2717e7a Merge pull request #8275 from stuartarchibald/wip/change_log_056_final
    • d617754 Merge pull request #8274 from stuartarchibald/doc/056_version_support_update
    • d42d598 Merge pull request #8269 from sklam/misc/cherry8255
    • a375ad7 Merge pull request #8255 from gmarkall/issue-8252
    • ea84072 Merge pull request #8205 from esc/pin_llvmlite_numpy
    • 99a231e adding 8205 to CHANGE_LOG
    • bfd8290 clamp NumPy version at 1.22
    • f789aab pin llvmlite to 0.39.*
    • Additional commits viewable in compare view

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 15
  • OSError: [WinError 127] Error loading

    OSError: [WinError 127] Error loading "D:\TODO\Python37_9\lib\site-packages\torch\lib\cublas64_11.dll" or one of its dependencies.

    Running from alibi_detect.datasets import fetch_kdd raises this error. I can use TensorFlow and PyTorch normally, and cublas64_11.dll exists.

    >>>print(tf.__version__)
    2.4.1
    >>> print(torch.__version__)
    1.9.0+cu111
    >>> print(alibi_detect.__version__)
    0.8.0
    

    The complete error output is as follows:

    line 1, in <module>
        from alibi_detect.datasets import fetch_kdd
      File "D:\TODO\Python37_9\lib\site-packages\alibi_detect\__init__.py", line 1, in <module>
        from . import ad, cd, models, od, utils
      File "D:\TODO\Python37_9\lib\site-packages\alibi_detect\ad\__init__.py", line 1, in <module>
        from .adversarialae import AdversarialAE
      File "D:\TODO\Python37_9\lib\site-packages\alibi_detect\ad\adversarialae.py", line 8, in <module>
        from alibi_detect.models.tensorflow.autoencoder import AE
      File "D:\TODO\Python37_9\lib\site-packages\alibi_detect\models\tensorflow\__init__.py", line 2, in <module>
        from .embedding import TransformerEmbedding
      File "D:\TODO\Python37_9\lib\site-packages\alibi_detect\models\tensorflow\embedding.py", line 3, in <module>
        from transformers import TFAutoModel, AutoConfig
      File "D:\TODO\Python37_9\lib\site-packages\transformers\__init__.py", line 43, in <module>
        from . import dependency_versions_check
      File "D:\TODO\Python37_9\lib\site-packages\transformers\dependency_versions_check.py", line 36, in <module>
        from .file_utils import is_tokenizers_available
      File "D:\TODO\Python37_9\lib\site-packages\transformers\file_utils.py", line 52, in <module>
        from huggingface_hub import HfFolder, Repository, create_repo, list_repo_files, whoami
      File "D:\TODO\Python37_9\lib\site-packages\huggingface_hub\__init__.py", line 59, in <module>
        from .hub_mixin import ModelHubMixin, PyTorchModelHubMixin
      File "D:\TODO\Python37_9\lib\site-packages\huggingface_hub\hub_mixin.py", line 16, in <module>
        import torch
      File "D:\TODO\Python37_9\lib\site-packages\torch\__init__.py", line 124, in <module>
        raise err
    

    Very strangely, the TensorFlow device info appears twice:

    2022-01-14 15:35:06.884649: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2022-01-14 15:35:10.097446: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
    2022-01-14 15:35:10.098949: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
    2022-01-14 15:35:10.141661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2060 computeCapability: 7.5
    coreClock: 1.2GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
    2022-01-14 15:35:10.142772: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2022-01-14 15:35:10.161065: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2022-01-14 15:35:10.161345: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2022-01-14 15:35:10.177333: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2022-01-14 15:35:10.182175: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2022-01-14 15:35:10.193852: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
    2022-01-14 15:35:10.202690: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2022-01-14 15:35:10.204346: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
    2022-01-14 15:35:10.204689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
    2022-01-14 15:35:10.205211: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2022-01-14 15:35:10.206735: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
    pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2060 computeCapability: 7.5
    coreClock: 1.2GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 245.91GiB/s
    2022-01-14 15:35:10.207406: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
    2022-01-14 15:35:10.207746: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
    2022-01-14 15:35:10.208085: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
    2022-01-14 15:35:10.208425: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
    2022-01-14 15:35:10.208766: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
    2022-01-14 15:35:10.209095: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
    2022-01-14 15:35:10.209408: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
    2022-01-14 15:35:10.209695: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
    2022-01-14 15:35:10.210058: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
    2022-01-14 15:35:11.154813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
    2022-01-14 15:35:11.155087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
    2022-01-14 15:35:11.155249: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
    2022-01-14 15:35:11.155625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4720 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
    2022-01-14 15:35:11.157391: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
    Traceback (most recent call last):
      File "C:/Users/lenovo/Desktop/DO/cs.py", line 1, in <module>
        from alibi_detect.datasets import fetch_kdd
    
    opened by 943fansi 15
  • Add KeOps MMD detector

    Add KeOps MMD detector

    Add MMD detector using the KeOps (PyTorch) backend to further accelerate drift detection and scale up to larger datasets. This PR needs to be made compatible with the optional dependency management (incl. #538 and related).
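
    A sketch of the intended usage once merged (assuming the new backend is exposed through the existing MMDDrift factory):

    from alibi_detect.cd import MMDDrift

    # hypothetical: select the KeOps backend to scale MMD drift detection
    cd = MMDDrift(x_ref, backend='keops', p_val=.05)
    preds = cd.predict(x)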

    This PR includes:

    • [x] MMD detector implementation using KeOps
    • [x] GaussianRBF kernel using KeOps
    • [x] Tests
    • [x] Docs
    • [x] Basic benchmarking example vs. PyTorch MMD
    • [x] Add a note to docs regarding lack of Windows support.
    • [x] Investigate segfault with MacOS, or drop support for now.
    • [x] Document sigma_mean vs. sigma_median and make foolproof.
    • [x] Update keops infer_sigma check.
    • [x] Update docstrings keops kernels to clarify various dims options + clarify within the forward pass.
    • [x] Clarify GPU requirements and prettify example.
    • [x] Document logic keops kernels more explicitly.
    • [x] Fully compatible tests with torch and tensorflow.
    • [x] Unit test _mmd2 -> check if results match that of the PyTorch implementation.
    • [x] Exception -> error type in keops test.
    • [x] Test sigma_mean for both "usual" (non-batch) and batch setting (unusual and should probably use the first batch entry since it corresponds to the original (x, y)).

    Once this PR is merged, it will be followed up by a similar implementation for the Learned (Deep) Kernel detector.

    opened by arnaudvl 14
  • what to do if ref and test data have different categories in chisquare?

    what to do if ref and test data have different categories in chisquare?

    I have a question that I think should have been taken into account in this library but I can't find the solution.

    Currently, if the reference data has categories for a feature that differ from those in the test data, we get an error when calling the predict method of TabularDrift or ChiSquareDrift. I created categories_per_feature on the whole data, but given the way I split the data, one of the features of my reference data has categories from 0 to 11, while the test data has 0 to 12. The error I get is operands could not be broadcast together with shapes (13,) (12,). This error comes from the chisquare function under the hood.

    I think this is not a rare incident and it is probable that the reference data does not have all the categories of the test data for one or more features.
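
    A hedged workaround sketch (assuming categories_per_feature can take the number of categories per feature, so that categories unseen in the reference data still get a bin):

    from alibi_detect.cd import TabularDrift

    # hypothetical: feature index 3 has 13 possible categories (0..12)
    categories_per_feature = {3: 13}
    cd = TabularDrift(x_ref, p_val=.05, categories_per_feature=categories_per_feature)
    preds = cd.predict(x_test)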

    opened by AsiehH 14
  • WIP: Context aware drift detector

    WIP: Context aware drift detector

    This PR implements the context-aware drift detector ContextAwareDrift based on @ojcobb and @arnaudvl's recent paper.

    Note: This PR implements the code only. The docs and examples will form part of a subsequent PR into the context_drift branch.

    Example(s)

    An example notebook demonstrating a basic use case is available here: https://gist.github.com/ascillitoe/689f748f16962d5a0ebd49c8ea09a6c6

    Outstanding tasks

    • [x] Tensorflow backend.
    • [x] Tests.
    • [x] Docs (methods and examples) - Delay until subsequent PR into context_drift.
    • [x] Improve docstrings (I've left these very sparse for now, and will update based on docs, in order to ensure notation and terms match).
    • [x] More detailed performance checks. See https://github.com/SeldonIO/alibi-detect/pull/454#issuecomment-1060599156 for discussion. @arnaudvl if we could run one of your heavier examples with both pytorch and tensorflow that should give us an idea of real-world performance of the TF backend.
    • [x] Check difference in GaussianRBF kernels.
    • [x] Think about whether to include update_ref option. If so need to include for c_ref too.
    opened by ascillitoe 13
  • Add a conda install option for `alibi-detect`

    Add a conda install option for `alibi-detect`

    A conda installation option could be very helpful. I have already started working on this, to add alibi-detect to conda-forge.

    Conda-forge PR:

    • https://github.com/conda-forge/staged-recipes/pull/17582

    Once the conda-forge PR is merged, you will be able to install the library with conda as follows:

    conda install -c conda-forge alibi-detect
    

    :bulb: I will push a PR to update the docs once the package is available on conda-forge.

    opened by sugatoray 11
  • Can't install alibi-detect with poetry because of long deps resolving

    Can't install alibi-detect with poetry because of long deps resolving

    With the following configuration

    [tool.poetry]
    name = "name"
    version = "0.0.0"
    description = ""
    authors = [""]
    
    [tool.poetry.dependencies]
    python = ">=3.7"
    alibi-detect = "^0.10.3"
    
    [tool.poetry.dev-dependencies]
    
    [build-system]
    requires = ["poetry-core>=1.0.0"]
    build-backend = "poetry.core.masonry.api"
    
    

    after running poetry install, dependency resolution never finishes.

    Using Python 3.10 and poetry 1.1.14 on an M1 Mac.

    Does anyone have a similar problem?

    Type: Maintenance 
    opened by smolendawid 10
  • Tensorflow requirement issue

    Tensorflow requirement issue

    When I import alibi_detect, I get the following error:

    AttributeError: module 'tensorflow_core.keras.activations' has no attribute 'swish'

    which is related to this issue: https://github.com/huggingface/transformers/issues/7333

    They say to update to TensorFlow 2.2, ~~but tf 2.2 requires at least Python 3.7.~~

    ~~So alibi-detect is basically not compatible with Python 3.6.~~

    opened by candalfigomoro 10
  • Saving VAE failed when using save_detector

    Saving VAE failed when using save_detector

    Hi, I'm trying to learn how to use the library. I'm training and saving a VAE as shown in the tutorial, just executing the code that builds the model and saves it after training.

    This is the error I receive:

    Traceback (most recent call last):
      File "/{}/.local/lib/python3.10/site-packages/alibi_detect/saving/saving.py", line 71, in save_detector
        save_detector_legacy(detector, filepath)
      File "/{}/.local/lib/python3.10/site-packages/alibi_detect/saving/tensorflow/_saving.py", line 237, in save_detector_legacy
        save_tf_vae(detector, filepath)
      File "/{}/.local/lib/python3.10/site-packages/alibi_detect/saving/tensorflow/_saving.py", line 786, in save_tf_vae
        detector.vae.encoder.encoder_net.save(model_dir.joinpath('encoder_net.h5'))
      File "/{}/.local/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
      File "/{}/.local/lib/python3.10/site-packages/keras/engine/base_layer.py", line 745, in get_config
        raise NotImplementedError(textwrap.dedent(f"""
    NotImplementedError: 
    Layer ModuleWrapper has arguments ['self', 'module', 'method_name']
    in `__init__` and therefore must override `get_config()`.
    
    Example:
    
    class CustomLayer(keras.layers.Layer):
        def __init__(self, arg1, arg2):
            super().__init__()
            self.arg1 = arg1
            self.arg2 = arg2
    
        def get_config(self):
            config = super().get_config()
            config.update({
                "arg1": self.arg1,
                "arg2": self.arg2,
            })
            return config
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/{}/Documents/test_alibi_detect/test_alibi_script2.py", line 70, in <module>
        save_detector(od, filepath, legacy = True)
      File "/{}/.local/lib/python3.10/site-packages/alibi_detect/saving/saving.py", line 77, in save_detector
        raise RuntimeError(f'Saving failed. The save directory {filepath} has been cleaned.') from error
    RuntimeError: Saving failed. The save directory outlierVAE has been cleaned.
    

    The code I executed:

    import os
    import logging
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf
    tf.keras.backend.clear_session()
    from tensorflow.python.keras.layers import Conv2D, Conv2DTranspose, Dense, Layer, Reshape, InputLayer
    from tqdm import tqdm
    
    from alibi_detect.models.tensorflow import elbo
    from alibi_detect.od import OutlierVAE
    from alibi_detect.utils.fetching import fetch_detector
    from alibi_detect.saving import save_detector, load_detector
    
    logger = tf.get_logger()
    logger.setLevel(logging.ERROR)
    
    train, test = tf.keras.datasets.cifar10.load_data()
    X_train, y_train = train
    X_test, y_test = test
    
    X_train = X_train.astype('float32') / 255
    X_test = X_test.astype('float32') / 255
    print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
    
    filepath = './'  # change to directory where model is downloaded
    detector_type = 'outlier'
    dataset = 'cifar10'
    detector_name = 'outlierVAE'
    filepath = os.path.join(filepath, detector_name)
    
    latent_dim = 1024
    
    encoder_net = tf.keras.Sequential(
      [
          InputLayer(input_shape=(32, 32, 3)),
          Conv2D(64, 4, strides=2, padding='same', activation=tf.nn.relu),
          Conv2D(128, 4, strides=2, padding='same', activation=tf.nn.relu),
          Conv2D(512, 4, strides=2, padding='same', activation=tf.nn.relu)
      ])
    
    decoder_net = tf.keras.Sequential(
      [
          InputLayer(input_shape=(latent_dim,)),
          Dense(4*4*128),
          Reshape(target_shape=(4, 4, 128)),
          Conv2DTranspose(256, 4, strides=2, padding='same', activation=tf.nn.relu),
          Conv2DTranspose(64, 4, strides=2, padding='same', activation=tf.nn.relu),
          Conv2DTranspose(3, 4, strides=2, padding='same', activation='sigmoid')
      ])
    
    # initialize outlier detector
    od = OutlierVAE(threshold=.015,  # threshold for outlier score
                    score_type='mse',  # use MSE of reconstruction error for outlier detection
                    encoder_net=encoder_net,  # can also pass VAE model instead
                    decoder_net=decoder_net,  # of separate encoder and decoder
                    latent_dim=latent_dim,
                    samples=2)
    # train
    od.fit(X_train[:100],
            loss_fn=elbo,
            cov_elbo=dict(sim=.05),
            epochs=1,
            verbose=True)
    
    # save the trained outlier detector
    save_detector(od, filepath, legacy = True)
    

    The installation is a fresh pip install: pip install alibi-detect[tensorflow]

    opened by mbisan 0
  • [DOC] ClassifierDrift

    [DOC] ClassifierDrift

    On which data is the classifier drift detector trained? The documentation does not state it very clearly.

    Classifier-based drift detector. The classifier is trained on a fraction of the combined reference and test data and drift is detected on the remaining data. To use all the data to detect drift, a stratified cross-validation scheme can be chosen.
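
    For reference, a sketch of the two modes described in the quoted docs (argument names assumed from the ClassifierDrift signature, so treat them as illustrative):

    from alibi_detect.cd import ClassifierDrift

    # hold-out mode: train on a random fraction of reference + test data,
    # detect drift on the held-out remainder
    cd = ClassifierDrift(x_ref, model, p_val=.05, train_size=.75)

    # cross-validation mode: stratified K-fold so all data is used for detection
    cd = ClassifierDrift(x_ref, model, p_val=.05, n_folds=5)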

    opened by cmougan 2
  • imdb reviews example: cannot serialize detector

    imdb reviews example: cannot serialize detector

    Hello,

    I am trying to follow the example here

    However, I am facing an error:

    at

    save_detector(cd, filepath)
    

    I get

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/alibi_detect/saving/saving.py:67, in save_detector(detector, filepath, legacy)
         66 if isinstance(detector, ConfigurableDetector) and not legacy:
    ---> 67     _save_detector_config(detector, filepath)
         69 # Otherwise, save via the previous meta and state_dict approach
         70 else:
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/alibi_detect/saving/saving.py:160, in _save_detector_config(detector, filepath)
        159 logger.info('Saving the preprocess_fn function.')
    --> 160 preprocess_cfg = _save_preprocess_config(preprocess_fn, backend, cfg['input_shape'], filepath)
        161 cfg['preprocess_fn'] = preprocess_cfg
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/alibi_detect/saving/saving.py:282, in _save_preprocess_config(preprocess_fn, backend, input_shape, filepath)
        281 elif callable(v):
    --> 282     src, _ = _serialize_object(v, filepath, local_path)
        283     kwargs.update({k: src})
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/alibi_detect/saving/saving.py:338, in _serialize_object(obj, base_path, local_path)
        337 with open(filepath.with_suffix('.dill'), 'wb') as f:
    --> 338     dill.dump(obj, f)
        339 src = str(local_path.with_suffix('.dill'))
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/dill/_dill.py:235, in dump(obj, file, protocol, byref, fmode, recurse, **kwds)
        234 _kwds.update(dict(byref=byref, fmode=fmode, recurse=recurse))
    --> 235 Pickler(file, protocol, **_kwds).dump(obj)
        236 return
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/dill/_dill.py:394, in Pickler.dump(self, obj)
        393 logger.trace_setup(self)
    --> 394 StockPickler.dump(self, obj)
    
    File /usr/lib/python3.8/pickle.py:487, in _Pickler.dump(self, obj)
        486     self.framer.start_framing()
    --> 487 self.save(obj)
        488 self.write(STOP)
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/dill/_dill.py:388, in Pickler.save(self, obj, save_persistent_id)
        387     raise PicklingError(msg)
    --> 388 StockPickler.save(self, obj, save_persistent_id)
    
    File /usr/lib/python3.8/pickle.py:578, in _Pickler.save(self, obj, save_persistent_id)
        577 if reduce is not None:
    --> 578     rv = reduce(self.proto)
        579 else:
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/engine/training.py:367, in Model.__reduce__(self)
        364 if self.built:
        365     return (
        366         pickle_utils.deserialize_model_from_bytecode,
    --> 367         (pickle_utils.serialize_model_as_bytecode(self),),
        368     )
        369 else:
        370     # SavedModel (and hence serialize_model_as_bytecode) only support
        371     # built models, but if the model is not built,
       (...)
        375     # the superclass hierarchy to get an implementation of __reduce__
        376     # that can pickle this Model as a plain Python object.
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/pickle_utils.py:73, in serialize_model_as_bytecode(model)
         72 except Exception as e:
    ---> 73     raise e
         74 else:
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/pickle_utils.py:69, in serialize_model_as_bytecode(model)
         68 filepath = os.path.join(temp_dir, "model.keras")
    ---> 69 saving_lib.save_model(model, filepath)
         70 with open(filepath, "rb") as f:
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/saving_lib.py:117, in save_model(model, filepath)
        115 _SAVING_V3_ENABLED.value = True
    --> 117 serialized_model_dict = serialize_keras_object(model)
        118 config_json = json.dumps(serialized_model_dict)
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:116, in serialize_keras_object(obj)
        112     registered_name = None
        113 return {
        114     "module": module,
        115     "class_name": class_name,
    --> 116     "config": _get_class_or_fn_config(obj),
        117     "registered_name": registered_name,
        118 }
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:141, in _get_class_or_fn_config(obj)
        137         raise TypeError(
        138             f"The `get_config()` method of {obj} should return "
        139             f"a dict. It returned: {config}"
        140         )
    --> 141     return serialize_dict(config)
        142 else:
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:151, in serialize_dict(obj)
        150 def serialize_dict(obj):
    --> 151     return {key: serialize_keras_object(value) for key, value in obj.items()}
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:151, in <dictcomp>(.0)
        150 def serialize_dict(obj):
    --> 151     return {key: serialize_keras_object(value) for key, value in obj.items()}
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:116, in serialize_keras_object(obj)
        112     registered_name = None
        113 return {
        114     "module": module,
        115     "class_name": class_name,
    --> 116     "config": _get_class_or_fn_config(obj),
        117     "registered_name": registered_name,
        118 }
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/keras/saving/experimental/serialization_lib.py:143, in _get_class_or_fn_config(obj)
        142 else:
    --> 143     raise TypeError(
        144         f"Cannot serialize object {obj} of type {type(obj)}. "
        145         "To be serializable, "
        146         "a class must implement the `get_config()` method."
        147     )
    
    TypeError: Cannot serialize object {'input_ids': TensorShape([5, 100]), 'token_type_ids': TensorShape([5, 100]), 'attention_mask': TensorShape([5, 100])} of type <class 'transformers.tokenization_utils_base.BatchEncoding'>. To be serializable, a class must implement the `get_config()` method.
    
    The above exception was the direct cause of the following exception:
    
    RuntimeError                              Traceback (most recent call last)
    Cell In[17], line 2
          1 filepath = 'my_path'  # change to directory where detector is saved
    ----> 2 save_detector(cd, filepath)
    
    File ~/envs/tmp-mlflow-whylogs/lib/python3.8/site-packages/alibi_detect/saving/saving.py:77, in save_detector(detector, filepath, legacy)
         75     orig_files = set(filepath.iterdir())
         76     _cleanup_filepath(orig_files, filepath)
    ---> 77     raise RuntimeError(f'Saving failed. The save directory {filepath} has been cleaned.') from error
         79 logger.info('finished saving.')
    
    RuntimeError: Saving failed. The save directory my_path has been cleaned.
    

    System and packages

    Ubuntu, Python 3.8.15, alibi-detect==0.10.4, transformers==4.25.1

    full packages:

    huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
    To disable this warning, you can either:
    	- Avoid using `tokenizers` before the fork if possible
    	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
    absl-py==1.3.0
    adal==1.2.7
    aiofiles==22.1.0
    aiohttp==3.8.3
    aiokafka==0.8.0
    aiosignal==1.3.1
    alembic==1.8.1
    alibi-detect==0.10.4
    anyio==3.6.2
    argcomplete==2.0.0
    argon2-cffi==21.3.0
    argon2-cffi-bindings==21.2.0
    arrow==1.2.3
    asttokens==2.2.1
    astunparse==1.6.3
    async-timeout==4.0.2
    attrs==22.1.0
    azure-common==1.1.28
    azure-core==1.26.1
    azure-graphrbac==0.61.1
    azure-mgmt-authorization==3.0.0
    azure-mgmt-containerregistry==10.0.0
    azure-mgmt-core==1.3.2
    azure-mgmt-keyvault==10.1.0
    azure-mgmt-resource==21.2.1
    azure-mgmt-storage==20.1.0
    azureml-core==1.48.0
    Babel==2.11.0
    backcall==0.2.0
    backports.tempfile==1.0
    backports.weakref==1.0.post1
    bcrypt==4.0.1
    beautifulsoup4==4.11.1
    bleach==5.0.1
    boto3==1.26.24
    botocore==1.29.24
    Brotli==1.0.9
    cachetools==5.2.0
    catalogue==2.0.8
    certifi==2022.9.24
    cffi==1.15.1
    charset-normalizer==2.1.1
    click==8.1.3
    cloudpickle==2.2.0
    contextlib2==21.6.0
    contourpy==1.0.6
    cryptography==38.0.4
    cycler==0.11.0
    databricks-cli==0.17.3
    debugpy==1.6.4
    decorator==5.1.1
    defusedxml==0.7.1
    dill==0.3.6
    distlib==0.3.6
    docker==6.0.1
    entrypoints==0.4
    executing==1.2.0
    fastapi==0.88.0
    fastjsonschema==2.16.2
    filelock==3.8.2
    Flask==2.2.2
    flatbuffers==22.12.6
    fonttools==4.38.0
    fqdn==1.5.1
    frozenlist==1.3.3
    gast==0.4.0
    gevent==22.10.2
    geventhttpclient==2.0.2
    gitdb==4.0.10
    GitPython==3.1.29
    google-api-core==2.11.0
    google-auth==2.15.0
    google-auth-oauthlib==0.4.6
    google-cloud-core==2.3.2
    google-cloud-storage==2.7.0
    google-crc32c==1.5.0
    google-pasta==0.2.0
    google-resumable-media==2.4.0
    googleapis-common-protos==1.57.0
    greenlet==2.0.1
    grpcio==1.51.1
    gunicorn==20.1.0
    h11==0.14.0
    h5py==3.7.0
    huggingface-hub==0.11.1
    humanfriendly==10.0
    idna==3.4
    imageio==2.22.4
    importlib-metadata==5.1.0
    importlib-resources==5.10.1
    ipykernel==6.17.1
    ipython==8.7.0
    ipython-genutils==0.2.0
    isodate==0.6.1
    isoduration==20.11.0
    itsdangerous==2.1.2
    jedi==0.18.2
    jeepney==0.8.0
    Jinja2==3.1.2
    jmespath==1.0.1
    joblib==1.2.0
    json5==0.9.10
    jsonpickle==2.2.0
    jsonpointer==2.3
    jsonschema==4.17.3
    jupyter-events==0.5.0
    jupyter_client==7.4.8
    jupyter_core==5.1.0
    jupyter_server==2.0.0
    jupyter_server_terminals==0.4.2
    jupyterlab==3.5.1
    jupyterlab-pygments==0.2.2
    jupyterlab_server==2.16.3
    kafka-python==2.0.2
    keras==2.11.0
    kiwisolver==1.4.4
    knack==0.10.1
    kubernetes==25.3.0
    libclang==14.0.6
    llvmlite==0.38.1
    Mako==1.2.4
    Markdown==3.4.1
    MarkupSafe==2.1.1
    marshmallow==3.19.0
    matplotlib==3.6.2
    matplotlib-inline==0.1.6
    mistune==2.0.4
    mlflow==2.0.1
    mlserver==1.2.0
    mlserver-mlflow==1.2.0
    msal==1.20.0
    msal-extensions==1.0.0
    msrest==0.7.1
    msrestazure==0.6.4
    multidict==6.0.3
    nbclassic==0.4.8
    nbclient==0.7.2
    nbconvert==7.2.6
    nbformat==5.7.0
    ndg-httpsclient==0.5.1
    nest-asyncio==1.5.6
    networkx==2.8.8
    nlp==0.4.0
    notebook==6.5.2
    notebook_shim==0.2.2
    numba==0.55.2
    numpy==1.22.0
    oauthlib==3.2.2
    opencv-python==4.6.0.66
    opt-einsum==3.3.0
    orjson==3.8.3
    packaging==21.3
    pandas==1.5.2
    pandocfilters==1.5.0
    paramiko==2.12.0
    parso==0.8.3
    pathspec==0.10.2
    pexpect==4.8.0
    pickleshare==0.7.5
    Pillow==9.3.0
    pkginfo==1.9.2
    pkgutil_resolve_name==1.3.10
    platformdirs==2.6.0
    portalocker==2.6.0
    prometheus-client==0.15.0
    prometheus-flask-exporter==0.21.0
    prompt-toolkit==3.0.36
    protobuf==3.19.6
    psutil==5.9.4
    ptyprocess==0.7.0
    pure-eval==0.2.2
    puremagic==1.14
    py-grpc-prometheus==0.7.0
    pyarrow==10.0.1
    pyasn1==0.4.8
    pyasn1-modules==0.2.8
    pybars3==0.9.7
    pycparser==2.21
    pydantic==1.10.2
    Pygments==2.13.0
    PyJWT==2.6.0
    PyMeta3==0.5.1
    PyNaCl==1.5.0
    pyOpenSSL==22.1.0
    pyparsing==3.0.9
    pyrsistent==0.19.2
    pysftp==0.2.9
    PySocks==1.7.1
    python-dateutil==2.8.2
    python-dotenv==0.21.0
    python-json-logger==2.0.4
    python-rapidjson==1.9
    pytz==2022.6
    PyWavelets==1.4.1
    PyYAML==6.0
    pyzmq==24.0.1
    querystring-parser==1.2.4
    regex==2022.10.31
    requests==2.28.1
    requests-auth-aws-sigv4==0.7
    requests-oauthlib==1.3.1
    rfc3339-validator==0.1.4
    rfc3986-validator==0.1.1
    rsa==4.9
    s3transfer==0.6.0
    scikit-image==0.19.3
    scikit-learn==1.1.3
    scipy==1.9.3
    seaborn==0.12.1
    SecretStorage==3.3.3
    Send2Trash==1.8.0
    shap==0.41.0
    six==1.16.0
    slicer==0.0.7
    smart-open==6.2.0
    smmap==5.0.0
    sniffio==1.3.0
    soupsieve==2.3.2.post1
    SQLAlchemy==1.4.44
    sqlparse==0.4.3
    stack-data==0.6.2
    starlette==0.22.0
    starlette-exporter==0.14.0
    tabulate==0.9.0
    tensorboard==2.11.0
    tensorboard-data-server==0.6.1
    tensorboard-plugin-wit==1.8.1
    tensorflow==2.11.0
    tensorflow-estimator==2.11.0
    tensorflow-io-gcs-filesystem==0.28.0
    termcolor==2.1.1
    terminado==0.17.1
    threadpoolctl==3.1.0
    tifffile==2022.10.10
    tinycss2==1.2.1
    tokenizers==0.13.2
    toml==0.10.2
    tomli==2.0.1
    tornado==6.2
    tqdm==4.64.1
    traitlets==5.6.0
    transformers==4.25.1
    tritonclient==2.28.0
    typing_extensions==4.4.0
    uri-template==1.2.0
    urllib3==1.26.13
    uvicorn==0.20.0
    uvloop==0.17.0
    virtualenv==20.17.1
    wcwidth==0.2.5
    webcolors==1.12
    webencodings==0.5.1
    websocket-client==1.4.2
    Werkzeug==2.2.2
    whylabs-client==0.3.0
    whylabs-datasketches==2.2.0b1
    whylogs==0.7.10
    whylogs-sketching==3.4.1.dev3
    wrapt==1.14.1
    xxhash==3.1.0
    yarl==1.8.2
    zipp==3.11.0
    zope.event==4.5.0
    zope.interface==5.5.2
    
    opened by cristianmtr 1
  • Update tensorflow-probability requirement from <0.19.0,>=0.8.0 to >=0.8.0,<0.20.0

    Update tensorflow-probability requirement from <0.19.0,>=0.8.0 to >=0.8.0,<0.20.0

    Updates the requirements on tensorflow-probability to permit the latest version.

    Release notes

    Sourced from tensorflow-probability's releases.

    TensorFlow Probability 0.19.0

    Release notes

    This is the 0.19.0 release of TensorFlow Probability. It is tested and stable against TensorFlow version 2.11 and JAX 0.3.25.

    Change notes

    [coming soon]

    Huge thanks to all the contributors to this release!

    [coming soon]

    Commits
    • 0759c57 Merge pull request #1660 from emilyfertig/r0.19
    • 80e6f25 Add deprecation notice to log_cumsum_exp.
    • ca88e1d Merge pull request #1659 from emilyfertig/r0.19
    • 9600b80 Revert "Add numpy/jax rewrite for cumulative_logsumexp."
    • 17ddf7c Remove "dev" version suffix for release.
    • d47f171 Increment TF version requirement to 2.11.
    • 6bba470 Clean up JAX random seed handling in Time Series example notebook.
    • e816859 FunMC: Add AIS/SMC.
    • 6fc9a9e Fix Hager-Zhang linesearch to accept intervals with zero derivative for the r...
    • ee8fbbe Allow jnp.bfloat16 arrays to be correctly recognized as floats.
    • Additional commits viewable in compare view

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 0
  • Numpy version issues when importing load_detector from alibi_detect.saving

    I am currently having issues importing load_detector because NumPy's version is reported as None during the import. The class below is used to load the detector and provide predictions on a dataset. The code is run on an AWS SageMaker SKLearn container (framework version 1.0-1) for Batch Transform. The issue does not arise every time the container is run (about 25% of the time).

    import contextlib
    import traceback
    import warnings

    warnings.filterwarnings("ignore", category=FutureWarning)

    import numpy
    import pandas as pd


    class DriftDetector:
        def __init__(self):
            self.is_init = False

        def load_detector(self, model_dir):
            # Load the saved detector once and cache it for subsequent calls.
            if self.is_init:
                return self.detector
            print(f"NUMPY VERSION....{numpy.__version__}")
            try:
                from alibi_detect.saving import load_detector
                self.detector = load_detector(model_dir)
                self.is_init = True
                print("MMD Drift Model Loaded ......")
                return self.detector
            except Exception as e:
                print(e)
                print(f"CHECKING NUMPY VERSION....{numpy.__version__}")
                traceback.print_tb(e.__traceback__)

        def detect(self, input_data, **_others):
            print(f"MMD Drift Model detecting ...... {_others}")
            _detector = self.load_detector(_others['model_dir'])
            with contextlib.suppress(ValueError):
                _result = _detector.predict(input_data)
                return pd.DataFrame([_result['data']])

    The requirements.txt file for the container includes:

    alibi-detect[tensorflow]==0.10.4
    markupsafe==2.0.1
    werkzeug==2.0.3
    importlib-metadata==5.0.0
    smart-open==5.2.1

    The output of the logs are:

    
    grkz9wuob7-algo-1-ywaz1 | MMD Drift Model detecting ...... {'model_dir': '/opt/ml/model'}
    grkz9wuob7-algo-1-ywaz1 | NUMPY VERSION....1.22.4
    grkz9wuob7-algo-1-ywaz1 | 2022-11-30 19:27:44.111746: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
    grkz9wuob7-algo-1-ywaz1 | 2022-11-30 19:27:44.111781: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
    grkz9wuob7-algo-1-ywaz1 | Unable to compare versions for numpy>=1.17: need=1.17 found=None. This is unusual. Consider reinstalling numpy.
    grkz9wuob7-algo-1-ywaz1 | CHECKING NUMPY VERSION....1.22.4
    grkz9wuob7-algo-1-ywaz1 |   File "/opt/ml/code/utils/driftdetecting.py", line 19, in load_detector
    grkz9wuob7-algo-1-ywaz1 |     from alibi_detect.saving import load_detector
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/__init__.py", line 1, in <module>
    grkz9wuob7-algo-1-ywaz1 |     from . import ad, cd, models, od, utils, saving
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/ad/__init__.py", line 3, in <module>
    grkz9wuob7-algo-1-ywaz1 |     AdversarialAE = import_optional('alibi_detect.ad.adversarialae', names=['AdversarialAE'])
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/utils/missing_optional_dependency.py", line 101, in import_optional
    grkz9wuob7-algo-1-ywaz1 |     module = import_module(module_name)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/importlib/__init__.py", line 127, in import_module
    grkz9wuob7-algo-1-ywaz1 |     return _bootstrap._gcd_import(name[level:], package, level)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/ad/adversarialae.py", line 9, in <module>
    grkz9wuob7-algo-1-ywaz1 |     from alibi_detect.models.tensorflow.autoencoder import AE
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/models/tensorflow/__init__.py", line 8, in <module>
    grkz9wuob7-algo-1-ywaz1 |     TransformerEmbedding = import_optional(
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/utils/missing_optional_dependency.py", line 101, in import_optional
    grkz9wuob7-algo-1-ywaz1 |     module = import_module(module_name)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/importlib/__init__.py", line 127, in import_module
    grkz9wuob7-algo-1-ywaz1 |     return _bootstrap._gcd_import(name[level:], package, level)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/alibi_detect/models/tensorflow/embedding.py", line 3, in <module>
    grkz9wuob7-algo-1-ywaz1 |     from transformers import TFAutoModel, AutoConfig
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/transformers/__init__.py", line 30, in <module>
    grkz9wuob7-algo-1-ywaz1 |     from . import dependency_versions_check
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/transformers/dependency_versions_check.py", line 41, in <module>
    grkz9wuob7-algo-1-ywaz1 |     require_version_core(deps[pkg])
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/transformers/utils/versions.py", line 123, in require_version_core
    grkz9wuob7-algo-1-ywaz1 |     return require_version(requirement, hint)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/transformers/utils/versions.py", line 117, in require_version
    grkz9wuob7-algo-1-ywaz1 |     _compare_versions(op, got_ver, want_ver, requirement, pkg, hint)
    grkz9wuob7-algo-1-ywaz1 |   File "/miniconda3/lib/python3.8/site-packages/transformers/utils/versions.py", line 45, in _compare_versions
    grkz9wuob7-algo-1-ywaz1 |     raise ValueError(
    
    
    

    As the NumPy version checks in the logs show, version 1.22.4 is installed but is not being picked up when loading the detector.

    opened by cfolan17 3
  • Follow-ups to #681

    Tracker for two items identified in #681.

    Item 1 (https://github.com/SeldonIO/alibi-detect/pull/681#discussion_r1030713876)

    Shallow copies are currently used to set and handle config, for example:

     cfg = self.config.copy()
    

    This is OK if dictionary items are only ever replaced, but it might lead to unexpected behaviour if any items are mutated. We should audit all instances and replace with deepcopy where necessary.
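
    A minimal illustration, not taken from the codebase, of why the shallow copy is risky when nested items are mutated:

    import copy

    config = {'kernel': {'sigma': 1.0}}   # hypothetical nested config

    cfg = config.copy()                   # shallow copy: nested dict is shared
    cfg['kernel']['sigma'] = 2.0          # mutating the nested item leaks into the original
    assert config['kernel']['sigma'] == 2.0

    cfg = copy.deepcopy(config)           # deep copy: nested dict is duplicated
    cfg['kernel']['sigma'] = 3.0
    assert config['kernel']['sigma'] == 2.0   # original left untouched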

    Item 2 (https://github.com/SeldonIO/alibi-detect/pull/681#discussion_r1030715477)

    We should replace the str in annotations such as kernel_a: Union[nn.Module, str] = 'rbf' with Literal's in the kernel submodules, since rbf is the only permitted string value.
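
    A sketch of the proposed typing change; the function names here are hypothetical:

    from typing import Literal, Union

    import torch.nn as nn

    # current annotation: any string passes static type checking
    def make_kernel_current(kernel_a: Union[nn.Module, str] = 'rbf'): ...

    # proposed annotation: only the permitted string value type-checks
    def make_kernel_proposed(kernel_a: Union[nn.Module, Literal['rbf']] = 'rbf'): ...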

    Type: Maintenance 
    opened by ascillitoe 0
Releases(v0.10.4)
  • v0.10.4(Oct 21, 2022)

  • v0.10.3(Aug 17, 2022)

  • v0.10.2(Aug 16, 2022)

  • v0.10.1(Aug 10, 2022)

  • v0.10.0(Jul 26, 2022)

    v0.10.0 (2022-07-26)

    Full Changelog

    Added

    • New feature Drift detectors' save/load functionality has been significantly reworked. All offline and online drift detectors (TensorFlow backend only) can now be saved and loaded via config.toml files, allowing for more flexibility. Config files are also validated with pydantic. See the documentation for more info (#516), and the usage sketch after this list.
    • New feature Option to use out-of-bag predictions when using a RandomForestClassifier with ClassifierDrift (#426).
    • Python 3.10 support. Note that PyTorch at the time of writing doesn't support Python 3.10 on Windows (#485).
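
    A rough usage sketch of the reworked save/load workflow; x_ref and x are assumed reference and test datasets (e.g. numpy arrays):

    from alibi_detect.cd import MMDDrift
    from alibi_detect.saving import save_detector, load_detector

    cd = MMDDrift(x_ref, backend='tensorflow', p_val=.05)
    save_detector(cd, './my_drift_detector')   # writes a config.toml describing the detector
    cd = load_detector('./my_drift_detector')  # rebuilds the detector from the validated config
    preds = cd.predict(x)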

    Fixed

    • Fixed a bug in the TensorFlow trainer which occurred when the data was a minibatch of size 2 (#492).

    Changed

    • TensorFlow is now an optional dependency. Error messages for incorrect use of detectors that are dependent on missing optional dependencies have been improved to include installation instructions and be more informative (#537).
    • The optional dependency work has resulted in some imports being reorganised. The original imports will still work as long as the relevant optional dependencies are installed (#538).
      • from alibi_detect.utils.tensorflow.kernels import DeepKernel -> from alibi_detect.utils.tensorflow import DeepKernel
      • from alibi_detect.utils.tensorflow.prediction import predict_batch -> from alibi_detect.utils.tensorflow import predict_batch
      • from alibi_detect.utils.pytorch.data import TorchDataset -> from alibi_detect.utils.pytorch import TorchDataset
      • from alibi_detect.models.pytorch.trainer import trainer -> from alibi_detect.models.pytorch import trainer
      • from alibi_detect.models.tensorflow.resnet import scale_by_instance -> from alibi_detect.models.tensorflow import scale_by_instance
      • from alibi_detect.utils.pytorch.kernels import DeepKernel -> from alibi_detect.utils.pytorch import DeepKernel
      • from alibi_detect.models.tensorflow.autoencoder import eucl_cosim_features -> from alibi_detect.models.tensorflow import eucl_cosim_features
      • from alibi_detect.models.tensorflow.losses import elbo -> from alibi_detect.models.tensorflow import elbo
      • from alibi_detect.models import PixelCNN -> from alibi_detect.models.tensorflow import PixelCNN
      • from alibi_detect.utils.tensorflow.data import TFDataset -> from alibi_detect.utils.tensorflow import TFDataset
    • The maximum tensorflow version has been bumped from 2.8 to 2.9 (#508).
    • breaking change The detector_type field in the detector.meta dictionary now indicates whether a detector is a 'drift', 'outlier' or 'adversarial' detector. Its previous meaning, whether a detector is online or offline, is now covered by the online field (#564).
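
    For illustration, with x_ref an assumed reference dataset:

    from alibi_detect.cd import MMDDrift

    cd = MMDDrift(x_ref)
    cd.meta['detector_type']  # now 'drift'; an outlier detector would report 'outlier'
    cd.meta['online']         # False here, since MMDDrift is an offline detector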

    Development

    • Added MissingDependency class and import_optional for protecting objects that are dependent on optional dependencies (#537).
    • Added BackendValidator to factor out similar logic across detectors with backends (#538).
    • Added missing CI test for ClassifierDrift with sklearn backend (#523).
    • Fixed typing for ContextMMDDrift pytorch backend with numpy>=1.22 (#520).
    • Drift detectors with backends refactored to perform distance threshold computation in score instead of predict (#489).
    • Factored out PyTorch device setting to utils.pytorch.misc.get_device() (#503). Thanks to @kuutsav!
    • Added utils._random submodule and pytest-randomly to manage determinism in CI build tests (#496).
    • From this release onwards we exclude the directories doc/ and examples/ from the source distribution (by adding prune directives in MANIFEST.in). This results in considerably smaller file sizes for the source distribution.
    • mypy has been updated to ~=0.900, which requires additional development dependencies for type stubs; currently only types-requests and types-toml have been necessary to add to requirements/dev.txt.
    Source code(tar.gz)
    Source code(zip)
  • v0.9.1(Apr 1, 2022)

    v0.9.1 (2022-04-01)

    Full Changelog

    Fixed

    • Fixed an issue whereby simply importing the library in any capacity caused tensorflow to occupy all available GPU memory. This was due to the instantiation of tf.keras.Model objects within a class definition (GaussianRBF objects within the DeepKernel class).
    Source code(tar.gz)
    Source code(zip)
  • v0.9.0(Mar 17, 2022)

    v0.9.0 (2022-03-17)

    Full Changelog

    Added

    • Added the ContextMMDDrift detector. The context-aware maximum mean discrepancy drift detector (Cobb and Van Looveren, 2022) is a kernel based method for detecting drift in a manner that can take relevant context into account.
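
    A minimal usage sketch, assuming reference data x_ref with matching contexts c_ref (e.g. numpy arrays):

    from alibi_detect.cd import ContextMMDDrift

    cd = ContextMMDDrift(x_ref, c_ref, backend='tensorflow', p_val=.05)
    preds = cd.predict(x, c)  # test data x evaluated given its contexts c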

    Development

    • The maximum tensorflow version has been bumped from 2.7 to 2.8 (#444).
    Source code(tar.gz)
    Source code(zip)
  • v0.8.1(Jan 18, 2022)

    v0.8.1 (2022-01-18)

    Full Changelog

    Added

    • New feature ClassifierDrift now supports sklearn models (#414). See this example.
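
    A hedged usage sketch with an assumed reference set x_ref and test set x:

    from sklearn.ensemble import RandomForestClassifier
    from alibi_detect.cd import ClassifierDrift

    model = RandomForestClassifier()
    cd = ClassifierDrift(x_ref, model, backend='sklearn', p_val=.05)
    preds = cd.predict(x)  # trains the classifier to distinguish x_ref from x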

    Changed

    • Python 3.6 has been removed from the supported versions as it has reached its end-of-life.

    Fixed

    • The SpectralResidual detector now uses padding to prevent spikes occurring at the beginning and end of scores (#396).
    • The handling of URLs in the dataset and model fetching methods has been modified to fix behaviour on Windows platforms.

    Development

    • numpy typing has been updated to be compatible with numpy 1.22 (#403). This is a prerequisite for upgrading to tensorflow 2.7.
    • The Alibi Detect CI tests now include Windows and MacOS platforms (#423).
    • The maximum tensorflow version has been bumped from 2.6 to 2.7 (#377).
    Source code(tar.gz)
    Source code(zip)
  • v0.8.0(Dec 9, 2021)

    v0.8.0 (2021-12-09)

    Full Changelog

    Added

    • Offline and online versions of Fisher's Exact Test detector for supervised drift detection on binary data: from alibi_detect.cd import FETDrift, FETDriftOnline (see the usage sketch after this list).
    • Offline and online versions of Cramér-von Mises detector for supervised drift detection on continuous data: from alibi_detect.cd import CVMDrift, CVMDriftOnline.
    • Offline supervised drift detection example on the penguin classification dataset.
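
    A rough sketch of the two new offline detectors; x_ref_cont and x_ref_bin are assumed reference datasets of continuous and binary data respectively:

    from alibi_detect.cd import CVMDrift, FETDrift

    cd_cont = CVMDrift(x_ref_cont, p_val=.05)  # continuous data, e.g. per-instance losses
    cd_bin = FETDrift(x_ref_bin, p_val=.05)    # binary data, e.g. correct/incorrect flags
    preds = cd_bin.predict(x_bin)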

    Changed

    • Refactored online detectors to separate updating of state (#371).
    • Update tensorflow lower bound to 2.2 due to minimum requirements from transformers.

    Fixed

    • Fixed incorrect kwarg name in utils.tensorflow.distance.permed_lsdd function (#399).

    Development

    • Updated sphinx for documentation building to >=4.2.0.
    • Added a CITATIONS.cff file for consistent citing of the library.
    • CI actions are no longer triggered on draft PRs (apart from a Read the Docs build).
    • Removed dependency on nbsphinx_link and moved examples under doc/source/examples with symlinks from the top-level examples directory.
    Source code(tar.gz)
    Source code(zip)
  • v0.7.3(Oct 29, 2021)

    v0.7.3 (2021-10-29)

    Full Changelog

    Added

    • DeepKernel is allowed without the kernel_b component, giving a kernel consisting of only a deep kernel component (kernel_a).
    • Documentation layout refreshed, and a new "Background to drift detection" added.

    Fixed

    • Model fetching methods now correctly handle nested filepaths.
    • For backward compatibility, fetch and load methods now attempt to fetch/load dill files, but fall back to pickle files.
    • Prevent dill from extending pickle dispatch table. This prevents undesirable behaviour if using pickle/joblib without dill imported later on (see #326).
    • For consistency between save_detector and load_detector, fetch_detector will no longer append detector_name to filepath.
    Source code(tar.gz)
    Source code(zip)
  • v0.7.2(Aug 17, 2021)

    v0.7.2 (2021-08-17)

    Full Changelog

    Added

    • Learned kernel drift detector with TensorFlow and PyTorch support: from alibi_detect.cd import LearnedKernelDrift (see the sketch after this list)
    • Spot-the-diff drift detector with TensorFlow and PyTorch support: from alibi_detect.cd import SpotTheDiffDrift
    • Online drift detection example on medical imaging data: https://github.com/SeldonIO/alibi-detect/blob/master/examples/cd_online_camelyon.ipynb
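
    A minimal sketch of the learned kernel detector with the TensorFlow backend; the projection architecture and the data (x_ref, x) are assumptions:

    import tensorflow as tf
    from alibi_detect.cd import LearnedKernelDrift
    from alibi_detect.utils.tensorflow import DeepKernel

    proj = tf.keras.Sequential([tf.keras.layers.Dense(32)])  # learnable projection
    kernel = DeepKernel(proj, eps=0.01)
    cd = LearnedKernelDrift(x_ref, kernel, backend='tensorflow', p_val=.05)
    preds = cd.predict(x)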
    Source code(tar.gz)
    Source code(zip)
  • v0.7.1(Jul 22, 2021)

    v0.7.1 (2021-07-22)

    Full Changelog

    Added

    • Extend allowed input type for drift detectors to include List[Any] with additional graph and text data examples.
    • Allow custom preprocessing steps within alibi_detect.utils.pytorch.prediction.predict_batch and alibi_detect.utils.tensorflow.prediction.predict_batch. This makes it possible to take List[Any] as input and combine instances in the list into batches of data in the right format for the model.

    Removed

    • PCA preprocessing step for drift detectors.

    Fixed

    • Improved the numerical stability of the LSDD detectors (offline and online) to avoid overflow/underflow caused by the higher dimensionality of the input data.
    • Fixed the Spectral Residual outlier detector test.
    Source code(tar.gz)
    Source code(zip)
  • v0.7.0(Jun 7, 2021)

    v0.7.0 (2021-06-07)

    Full Changelog

    Added

    • Least-squares density difference drift detector (from alibi_detect.cd import LSDDDrift) with TensorFlow and PyTorch support.
    • Online versions of the MMD and LSDD drift detectors (from alibi_detect.cd import MMDDriftOnline, LSDDDriftOnline) with TensorFlow and PyTorch support; see the sketch after this list.
    • Enable Python 3.9 support.
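
    A hedged sketch of the online variant; x_ref and the instance stream are assumptions:

    from alibi_detect.cd import MMDDriftOnline

    # ert: expected run time, i.e. the average number of steps before a false alarm
    cd = MMDDriftOnline(x_ref, ert=100, window_size=20, backend='pytorch')
    for x_t in stream:               # instances arrive one at a time
        pred = cd.predict(x_t)
        if pred['data']['is_drift']:
            break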

    Fixed

    • Fixed the hidden layer output preprocessing step for drift detectors when internal layers have a higher-dimensional shape, e.g. (B, C, H, W).
    Source code(tar.gz)
    Source code(zip)
  • v0.6.2(May 6, 2021)

  • v0.6.1(Apr 26, 2021)

    v0.6.1 (2021-04-26)

    Full Changelog

    Added

    • Classification and regression model uncertainty drift detectors for both PyTorch and TensorFlow models: from alibi_detect.cd import ClassifierUncertaintyDrift, RegressorUncertaintyDrift (see the sketch after this list).
    • Return p-values for ClassifierDrift detectors using either a KS test on the classifier's probabilities or logits. The model predictions can also be binarised and a binomial test can be applied.
    • Allow unseen categories in the test batches for the categorical and tabular drift detectors: from alibi_detect.cd import ChiSquareDrift, TabularDrift.
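
    A rough usage sketch; model is an assumed trained TensorFlow classifier and x_ref a reference dataset:

    from alibi_detect.cd import ClassifierUncertaintyDrift

    cd = ClassifierUncertaintyDrift(x_ref, model=model, backend='tensorflow',
                                    p_val=.05, preds_type='probs')
    preds = cd.predict(x)  # tests for drift in the model's prediction uncertainty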
    Source code(tar.gz)
    Source code(zip)
  • v0.6.0(Apr 12, 2021)

    v0.6.0 (2021-04-12)

    Full Changelog

    Added

    • Flexible backend support (TensorFlow and PyTorch) for the MMDDrift and ClassifierDrift drift detectors, as well as support for both frameworks for preprocessing steps (from alibi_detect.cd.tensorflow import HiddenOutput, preprocess_drift and from alibi_detect.models.tensorflow import TransformerEmbedding; replace tensorflow with pytorch for PyTorch support) and for various utility functions (kernels and distance metrics) under alibi_detect.utils.tensorflow and alibi_detect.utils.pytorch. See the sketch after this list.
    • Significantly faster implementation of the MMDDrift detector, leveraging the GPU implementations in both TensorFlow and PyTorch as well as making efficient use of the cached kernel matrix for the permutation tests.
    • Changed the test for ChiSquareDrift from a goodness-of-fit test of the observed data against the empirical distribution of the reference data to a test for homogeneity, which does not bias p-values as much towards the extremes.
    • Included NumpyEncoder in the library to facilitate JSON serialization.
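
    A brief sketch of switching backends; x_ref and x are assumed datasets:

    from alibi_detect.cd import MMDDrift

    cd_tf = MMDDrift(x_ref, backend='tensorflow', p_val=.05)
    cd_pt = MMDDrift(x_ref, backend='pytorch', p_val=.05)  # identical API, different backend
    preds = cd_pt.predict(x)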

    Removed

    • As part of the introduction of flexible backends for various drift detectors, dask is no longer supported for the MMDDrift detector and distance computations.

    Fixed

    • Update RTD theme version due to rendering bug.
    • Fixed a bug when using TabularDrift with both categorical and continuous numerical features, where categorical columns were indexed incorrectly.

    Development

    • Pinned the pystan version to a release that works with prophet.
    Source code(tar.gz)
    Source code(zip)
  • v0.5.1(Mar 5, 2021)

    v0.5.1 (2021-03-05)

    Full Changelog

    This is a bug fix release.

    Fixed

    • The order of the reference and test datasets for TabularDrift and ChiSquareDrift was reversed, leading to incorrect test statistics.
    • The implementations of TabularDrift and ChiSquareDrift were not accounting for the different sample sizes between reference and test datasets, leading to incorrect test statistics.
    • Bumped the required scipy version to 1.3.0 as older versions were missing the alternative keyword argument for the ks_2samp function.
    Source code(tar.gz)
    Source code(zip)
  • v0.5.0(Feb 18, 2021)

    v0.5.0 (2021-02-18)

    Full Changelog

    Added

    • Chi-square drift detector for categorical data: alibi_detect.cd.chisquare.ChiSquareDrift
    • Mixed-type tabular data drift detector: alibi_detect.cd.tabular.TabularDrift (see the sketch after this list)
    • Classifier-based drift detector: alibi_detect.cd.classifier.ClassifierDrift
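
    A minimal sketch for mixed-type tabular data; the column indices are illustrative and x_ref, x assumed datasets:

    from alibi_detect.cd import TabularDrift

    # map categorical column indices to their categories; None infers them from x_ref,
    # and remaining columns are treated as numerical
    cd = TabularDrift(x_ref, p_val=.05, categories_per_feature={0: None, 3: None})
    preds = cd.predict(x)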

    Removed

    • DataTracker utility

    Development

    • Docs build improvements, dependabot integration, daily build cronjob
    Source code(tar.gz)
    Source code(zip)
  • v0.4.4(Dec 23, 2020)

    v0.4.4 (2020-12-23)

    Full Changelog

    Added

    • Remove integrations directory
    • Extend the return dict of the drift detectors
    • Update saving/loading functionality of the drift detectors. This leads to a breaking change for load_detector for the KSDrift and MMDDrift detectors.
    Source code(tar.gz)
    Source code(zip)
  • v0.4.3(Oct 8, 2020)

  • v0.4.2(Sep 9, 2020)

    v0.4.2 (2020-09-09)

    Full Changelog

    Added

    • Text drift detector functionality for KS and MMD drift detectors
    • Add embedding extraction functionality for pretrained HuggingFace transformers models (alibi_detect.models.embedding)
    • Add Python 3.8 support
    Source code(tar.gz)
    Source code(zip)
  • v0.4.1(May 12, 2020)

    v0.4.1 (2020-05-12)

    Full Changelog

    Added

    • Likelihood ratio outlier detector (alibi_detect.od.llr.LLR) with image and genome dataset examples
    • Add genome dataset (alibi_detect.datasets.fetch_genome)
    • Add PixelCNN++ model (alibi_detect.models.pixelcnn.PixelCNN)
    Source code(tar.gz)
    Source code(zip)
  • v0.4.0(Apr 2, 2020)

  • v0.3.1(Feb 26, 2020)

    v0.3.1 (2020-02-26)

    Full Changelog

    Added

    • Adversarial autoencoder detection method (offline method, alibi_detect.ad.adversarialae.AdversarialAE)
    • Add pretrained adversarial and outlier detectors to Google Cloud Bucket and include fetch functionality
    • Add data/concept drift dataset (CIFAR-10-C) to Google Cloud Bucket and include fetch functionality
    • Update VAE loss function and log var layer
    • Fix tests for Prophet outlier detector on Python 3.6
    • Add batch sizes for all detectors
    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Jan 17, 2020)

    v0.3.0 (2020-01-17)

    Full Changelog

    Added

    • Multivariate time series outlier detection method OutlierSeq2Seq (offline method, alibi_detect.od.seq2seq.OutlierSeq2Seq)
    • ECG and synthetic data examples for OutlierSeq2Seq detector
    • Auto-Encoder outlier detector (offline method, alibi_detect.od.ae.OutlierAE)
    • Tabular and categorical perturbation functions (alibi_detect.utils.perturbation)
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Dec 6, 2019)

    v0.2.0 (2019-12-06)

    Full Changelog

    Added

    • Univariate time series outlier detection methods: Prophet (offline method, alibi_detect.od.prophet.OutlierProphet) and Spectral Residual (online method, alibi_detect.od.sr.SpectralResidual)
    • Function for fetching Numenta Anomaly Benchmark time series data (alibi_detect.datasets.fetch_nab)
    • Perturbation function for time series data (alibi_detect.utils.perturbation.inject_outlier_ts)
    • Roadmap
    Source code(tar.gz)
    Source code(zip)
  • v0.1.0(Nov 19, 2019)

    Change Log

    v0.1.0 (2019-11-19)

    Added

    • Isolation Forest (Outlier Detection)
    • Mahalanobis Distance (Outlier Detection)
    • Variational Auto-Encoder (VAE, Outlier Detection)
    • Auto-Encoding Gaussian Mixture Model (AEGMM, Outlier Detection)
    • Variational Auto-Encoding Gaussian Mixture Model (VAEGMM, Outlier Detection)
    • Adversarial Variational Auto-Encoder (Adversarial Detection)
    Source code(tar.gz)
    Source code(zip)
Owner
Seldon
Machine Learning Deployment for Kubernetes