Streaming over lightweight data transformations

Overview

Description

Data augmentation library for Deep Learning, which supports images, segmentation masks, labels and keypoints. Furthermore, SOLT is fast and has OpenCV in its backend. Full auto-generated docs and examples are available here: https://mipt-oulu.github.io/solt/.

Features

  • Support for images, masks and keypoints in all the transforms (including multiple items at a time)
  • Fast and PyTorch-integrated
  • Convenient and flexible serialization API
  • Excellent documentation
  • Easy to extend
  • 100% Code coverage

Examples

Example rows (images not shown):

  • Images: Cats
  • Images + Keypoints: Cats
  • Medical Images + Binary Masks: Brain MRI
  • Medical Images + Multiclass Masks: Knee MRI

For example, the last row is generated using the following stream of transforms:

import solt
import solt.transforms as slt

# SelectiveStream picks n transforms from its list on each call
stream = solt.Stream([
    slt.Rotate(angle_range=(-20, 20), p=1, padding='r'),
    slt.Crop((256, 256)),
    solt.SelectiveStream([
        slt.GammaCorrection(gamma_range=0.5, p=1),
        slt.Noise(gain_range=0.1, p=1),
        slt.Blur()
    ], n=3)
])

img_aug, mask_aug = stream({'image': img, 'mask': mask})

If you want to visualize the results, you need to modify the execution of the transforms:

img_aug, mask_aug = stream({'image': img, 'mask': mask}, return_torch=False).data

Installation

The most recent release is available on PyPI:

pip install solt

You can fetch the latest changes from this repository:

pip install git+https://github.com/MIPT-Oulu/solt

Benchmark

We propose a fair benchmark based on a refactored version of the one proposed by the albumentations team. Unlike the original, it also converts the results into a PyTorch tensor and applies the ImageNet normalization. The following numbers support a realistic and honest comparison between the libraries (number of images per second; higher is better):

Transform          albumentations  torchvision (Pillow-SIMD backend)  augmentor  solt
                   0.4.3           0.5.0                              0.2.8      0.1.9
HorizontalFlip     2253            2549                               2561       3530
VerticalFlip       2380            2557                               2572       3740
RotateAny          1479            1389                               670        2070
Crop224            2566            1966                               1981       4281
Crop128            5467            5738                               5720       7186
Crop64             9285            9112                               9049       10345
Crop32             11979           10550                              10607      12348
Pad300             1642            109                                -          2631
VHFlipRotateCrop   1574            1334                               616        1889
HFlipCrop          2391            1943                               1917       3572

Python and library versions: Python 3.7.0 (default, Oct 9 2018, 10:31:47) [GCC 7.3.0], numpy 1.18.1, pillow-simd 7.0.0.post3, opencv-python 4.2.0.32, scikit-image 0.16.2, scipy 1.4.1.

The code was run on an AMD Threadripper 1900. Please find the details about the benchmark here.
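As a rough illustration of how an images-per-second figure like the ones in this section can be measured, here is a minimal sketch. It is not the benchmark code behind the reported numbers: `images_per_second`, `n_runs`, and the toy transform are hypothetical names chosen for this example.

```python
import time


def images_per_second(transform, images, n_runs=3):
    """Return the best throughput (images/s) observed over n_runs passes."""
    best_elapsed = float("inf")
    for _ in range(n_runs):
        start = time.perf_counter()
        for img in images:
            transform(img)
        best_elapsed = min(best_elapsed, time.perf_counter() - start)
    # Guard against a zero reading from a very coarse timer
    return len(images) / max(best_elapsed, 1e-9)


# Toy stand-in for an augmentation: reverse each row of a nested list
def toy_flip(img):
    return [row[::-1] for row in img]


images = [[[0, 1, 2], [3, 4, 5]] for _ in range(1000)]
throughput = images_per_second(toy_flip, images)
print(f"{throughput:.0f} images/s")
```

Taking the best of several passes reduces the influence of warm-up and background load; a real comparison would also include the tensor conversion and normalization steps mentioned above.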

How to contribute

Follow the guidelines described here.

Author

Aleksei Tiulpin, Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Finland.

How to cite

If you use SOLT and cite it in your research, please don't hesitate to send an email to Aleksei Tiulpin. All the papers that use SOLT are listed here.

@misc{solt2019,
  author       = {Aleksei Tiulpin},
  title        = {SOLT: Streaming over Lightweight Transformations},
  month        = jul,
  year         = 2019,
  version      = {v0.1.9},
  doi          = {10.5281/zenodo.3702819},
  url          = {https://doi.org/10.5281/zenodo.3702819}
}
Comments
  • Crashed without a clear error message when transforming multiple images with different shapes

    Code to reproduce:

    import solt
    import solt.transforms as slt
    import numpy as np
    
    
    if __name__ == "__main__":
        trf = solt.Stream([slt.Resize(resize_to=(50, 50)),
                           slt.Flip()])
        img1 = np.ones((100, 100, 3), dtype=np.int32)
        img2 = np.ones((110, 110, 3), dtype=np.int32)
        trf_img = trf({'images': (img1, img2)})
        print('Done')
    

    Adding a resize transformation at the beginning doesn't help. A clear error message is needed for this case.

    opened by hoanghng 4
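The kind of check requested in the issue above could look like the following sketch. It assumes that all images passed in one call must share the same height and width; `check_same_spatial_shape` is a hypothetical helper, not part of the SOLT API.

```python
import numpy as np


def check_same_spatial_shape(images):
    """Raise ValueError if the images do not all share one height and width."""
    if len(images) == 0:
        return
    expected = images[0].shape[:2]
    for idx, img in enumerate(images):
        if img.shape[:2] != expected:
            raise ValueError(
                f"Image {idx} has spatial shape {img.shape[:2]}, "
                f"but image 0 has {expected}; all images passed together "
                "must have the same height and width."
            )


img1 = np.ones((100, 100, 3), dtype=np.uint8)
img2 = np.ones((110, 110, 3), dtype=np.uint8)
try:
    check_same_spatial_shape((img1, img2))
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

Running such a check before any transform is applied turns an obscure crash into a message that names the offending item.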
  • NameError: name 'ktps' is not defined

    While testing one of the notebook algorithms, this error occurred:

    kpts = None
    for annotation_fname in glob.glob(os.path.join('Data', 'helen_annotations', '*.txt')):
        with open(annotation_fname) as f:
            if f.readline()[:-1] == fname.split('.')[0]:
                ktps = []
                for l in f:
                    tmp = l.split()
                    ktps.append([float(tmp[0]), float(tmp[2])])
                break
    kpts = np.array(ktps)

    Specific Error:

    NameError                                 Traceback (most recent call last)
    in
          8             ktps.append([float(tmp[0]), float(tmp[2])])
          9         break
    ---> 10 kpts = np.array(ktps)

    NameError: name 'ktps' is not defined

    opened by tobimichigan 3
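The error in the issue above arises because `ktps` is only assigned when an annotation file matches, so the final line fails when none does. One possible way to avoid this is sketched below. The function name `load_keypoints` and the file layout (image name on the first line, then `x , y` pairs) are assumptions made for this example, inferred from the snippet in the report.

```python
import glob
import os


def load_keypoints(annotation_dir, fname):
    """Return a list of [x, y] pairs for fname, or None if nothing matches."""
    target = fname.split('.')[0]
    for annotation_fname in glob.glob(os.path.join(annotation_dir, '*.txt')):
        with open(annotation_fname) as f:
            # Assumed layout: the first line holds the image name
            if f.readline().strip() != target:
                continue
            kpts = []
            for line in f:
                tmp = line.split()
                if len(tmp) < 3:
                    continue  # skip blank or malformed lines
                # Assumed layout: "x , y", so y is the third token
                kpts.append([float(tmp[0]), float(tmp[2])])
            return kpts
    return None
```

The caller can then test for `None` explicitly instead of hitting a `NameError`, for example to report that the annotation folder is missing or the image name does not match.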
  • Use pytorch for 3D+ data

    Proof of concept.

    • Switched to pytorch for 3D+ in Flip, Crop and Pad
    • Flip now accepts multiple axes, but not axis=-1. Also, the axis order has been switched from opencv to numpy
    • Minor cleanup to the tests
    opened by soupault 2
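To illustrate the multi-axis flip described in the pull request above, here is a small sketch using NumPy as a stand-in (the pull request itself switches to PyTorch). `flip_nd` is a hypothetical helper; rejecting negative axes mirrors the stated restriction on axis=-1 and is an assumption about how that restriction might be enforced.

```python
import numpy as np


def flip_nd(volume, axes):
    """Flip an N-dimensional array along one or several non-negative axes."""
    axes = (axes,) if isinstance(axes, int) else tuple(axes)
    if any(a < 0 for a in axes):
        raise ValueError("Negative axes are not supported in this sketch")
    # np.flip accepts a tuple of axes and reverses each of them
    return np.flip(volume, axis=axes)


vol = np.arange(8).reshape(2, 2, 2)
flipped = flip_nd(vol, (0, 2))
print(flipped[0, 0, 0], vol[1, 0, 1])
```

Axes here follow the NumPy order (axis 0 is the first array dimension), which matches the axis-order change mentioned in the pull request.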
  • nd follow-up

    • Removed height and width concepts where possible
    • Switched from shape checking decorator to function
    • Added shape validation also for masks and keypoints
    • Split utils into checks and serial
    opened by soupault 2
  • Bump opencv-python-headless from 4.1.2.30 to 4.2.0.32 in /ci

    Bumps opencv-python-headless from 4.1.2.30 to 4.2.0.32.

    Release notes

    Sourced from opencv-python-headless's releases.

    4.2.0.32

    OpenCV version 4.2.0.

    Changes:

    • macOS environment updated from xcode8.3 to xcode 9.4
    • macOS uses now Qt 5 instead of Qt 4
    • Nasm version updated to Docker containers
    • multibuild updated

    Fixes:

    • don't use deprecated brew tap-pin, instead refer to the full package name when installing #267
    • replace get_config_var() with get_config_vars() in setup.py #274
    • add workaround for DLL errors in Windows Server #264

    dependencies 
    opened by dependabot[bot] 1
  • Fix the documentation bug

    The line https://github.com/MIPT-Oulu/solt/blob/485bfb0d471134f75e09d06aa5e9ce4b57c0e13c/solt/transforms/_transforms.py#L1018 describes noise. It should describe the intensities instead.

    opened by lext 1
  • Compatibility issues with latest pytorch

    Works fine with 1.1, but not 1.5. Traceback:

        as_dict=as_dict, scale_keypoints=scale_keypoints, normalize=normalize, mean=mean, std=std,
      File ".../site-packages/solt/core/_data.py", line 273, in to_torch
        img.sub_(mean)
    RuntimeError: output with backend CPU and dtype Byte doesn't match the desired backend CPU and dtype Float
    Process finished with exit code 1
    
    opened by soupault 1
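The traceback above shows an in-place subtraction of a float mean from a byte (uint8) image. The usual remedy is to cast the image to a floating-point type before any in-place arithmetic. The sketch below illustrates that ordering with NumPy rather than PyTorch, so it is an analogy and not the fix applied in SOLT; `normalize_image` and the division by 255 are assumptions made for this example.

```python
import numpy as np


def normalize_image(img_uint8, mean, std):
    """Cast to float32 first, then subtract the mean and divide by the std in place."""
    # Casting before the in-place operations avoids the dtype mismatch
    img = img_uint8.astype(np.float32) / 255.0
    img -= np.asarray(mean, dtype=np.float32)
    img /= np.asarray(std, dtype=np.float32)
    return img


img = np.full((2, 2, 3), 255, dtype=np.uint8)
out = normalize_image(img, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
print(out.dtype, float(out[0, 0, 0]))
```

The per-channel mean and std broadcast over the last axis of an H x W x C image.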
  • Can't run unit tests locally on MacOS

    🐛 Bug

    It is not possible to run unit tests locally on MacOS X (10.12).

    To Reproduce

    Steps to reproduce the behavior:

    1. Create a conda environment and install all of the packages as suggested here.

    2. Run the following code: pytest tests

    3. The unit tests fail and generate an error:

    ImportError while importing test module '/Users/melekhi1/projects/solt/tests/test_data_core.py'.
    Hint: make sure your test modules/packages have valid Python names.
    Traceback:
    tests/test_data_core.py:4: in <module>
        import cv2
    ../../anaconda2/envs/solt_test_env/lib/python3.8/site-packages/cv2/__init__.py:3: in <module>
        from .cv2 import *
    E   ImportError: dlopen(/Users/melekhi1/anaconda2/envs/solt_test_env/lib/python3.8/site-packages/cv2/cv2.cpython-38-darwin.so, 2): Symbol not found: _inflateValidate
    E     Referenced from: /Users/melekhi1/anaconda2/envs/solt_test_env/lib/python3.8/site-packages/cv2/.dylibs/libpng16.16.dylib (which was built for Mac OS X 10.13)
    E     Expected in: /usr/lib/libz.1.dylib
    E    in /Users/melekhi1/anaconda2/envs/solt_test_env/lib/python3.8/site-packages/cv2/.dylibs/libpng16.16.dylib
    

    Expected behavior

    The unit tests should pass

    =============================== test session starts ================================
    platform darwin -- Python 3.8.2, pytest-3.6.4, py-1.8.1, pluggy-0.7.1
    rootdir: /Users/melekhi1/projects/solt, inifile:
    plugins: pep8-1.0.6, flake8-1.0.2, cov-2.6.0
    collected 701 items
    
    tests/test_base_transforms.py .............................................. [  6%]
    ............................................................................ [ 17%]
    .....                                                                        [ 18%]
    tests/test_data_core.py .................................................... [ 25%]
    ............................................................................ [ 36%]
    ............................................................................ [ 47%]
    ..................................................................           [ 56%]
    tests/test_transforms.py ................................................... [ 63%]
    ............................................................................ [ 74%]
    ............................................................................ [ 85%]
    ............................................................................ [ 96%]
    .........                                                                    [ 97%]
    tests/test_utils.py ................                                         [100%]
    
    ============================ 701 passed in 6.79 seconds ============================
    

    Suggestions

    The bug seems to be related to the latest version of the opencv-python-headless package (4.2.0.32). Downgrading to the previous version, 4.1.2.30, appears to resolve the issue, at least on this particular platform (MacOS 10.12). It would be great to test this solution on more recent MacOS versions.

    opened by imelekhov 1
  • Incorrect serialization of a cropping transform

    The attribute crop_size needs to be renamed to crop_to to match the constructor argument. Otherwise, the serialization does not work correctly.

    https://github.com/MIPT-Oulu/solt/blob/770e397884bcafe80a11723c229e275c1c1f8b5a/solt/transforms/_transforms.py#L715

    bug 
    opened by lext 1
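To show why the attribute name must match the constructor argument, here is a minimal sketch of a serialization scheme that reads the constructor signature and looks up attributes of the same name. This is an assumption about the general mechanism, not SOLT's actual code; `CropSketch`, `BrokenCropSketch`, `to_dict`, and `from_dict` are hypothetical.

```python
import inspect


class CropSketch:
    def __init__(self, crop_to=None, crop_mode="c"):
        # Attribute names mirror the constructor arguments
        self.crop_to = crop_to
        self.crop_mode = crop_mode


class BrokenCropSketch:
    def __init__(self, crop_to=None):
        # Mismatched name: serialization by argument name cannot find it
        self.crop_size = crop_to


def to_dict(obj):
    """Collect constructor arguments by reading same-named attributes."""
    params = inspect.signature(type(obj).__init__).parameters
    return {name: getattr(obj, name) for name in params if name != "self"}


def from_dict(cls, state):
    """Rebuild an object by passing the stored values back to the constructor."""
    return cls(**state)


state = to_dict(CropSketch(crop_to=(256, 256)))
restored = from_dict(CropSketch, state)
print(state, restored.crop_to)
```

With `BrokenCropSketch`, `to_dict` raises an `AttributeError` because no attribute called `crop_to` exists, which is the kind of failure the rename prevents.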
  • Fair benchmark with the other libraries

    The augbench package needs further development, and its results need to be reported in the README.

    The benchmark needs to cover random transformations rather than static ones. It is important to compare three cases: image, image + mask, and image + 10 masks (instance segmentation task).

    opened by lext 1
  • Brightness needs to support percentages of the mean intensity

    The use case for this is that for some images it might be better to rely on their mean intensity and increase their intensity by n% of the mean value.

    enhancement 
    opened by lext 0
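The enhancement requested above can be sketched as follows. This is an illustration of the idea only, assuming 8-bit images and a shift expressed as a percentage of the image's own mean; `brightness_by_mean_percent` is a hypothetical name, not an existing SOLT transform.

```python
import numpy as np


def brightness_by_mean_percent(img, percent):
    """Shift intensities by percent% of the image mean, clipped to [0, 255]."""
    img_f = img.astype(np.float32)
    shift = img_f.mean() * percent / 100.0
    return np.clip(img_f + shift, 0, 255).astype(np.uint8)


img = np.full((4, 4), 100, dtype=np.uint8)
brighter = brightness_by_mean_percent(img, 10)
darker = brightness_by_mean_percent(img, -50)
print(int(brighter[0, 0]), int(darker[0, 0]))
```

Because the shift scales with the mean, dark images receive a smaller absolute change than bright ones for the same percentage.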
  • Allow selecting a random number of augmentations in SelectiveStream

    Currently, the selective stream takes a parameter that specifies how many transforms to select from the list of transformations. It was recently suggested to make this a range and to sample a random number of augmentations from it.

    feature request 
    opened by lext 0
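The feature request above could be prototyped along these lines. The sketch is independent of SOLT's SelectiveStream; `select_transforms` and `n_range` are hypothetical names, and the inclusive range is an assumption.

```python
import random


def select_transforms(transforms, n_range, rng=random):
    """Pick a random number of distinct transforms, with the count drawn from n_range."""
    low, high = n_range
    if not 0 <= low <= high <= len(transforms):
        raise ValueError("n_range must satisfy 0 <= low <= high <= len(transforms)")
    n = rng.randint(low, high)  # inclusive on both ends
    return rng.sample(transforms, n)


rng = random.Random(0)
picked = select_transforms(["gamma", "noise", "blur"], (1, 2), rng)
print(picked)
```

Passing a seeded `random.Random` instance keeps the selection reproducible, which is convenient in tests.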
Releases(v0.1.9)
  • v0.1.9(Mar 10, 2020)

    • [x] SOLT is now PyTorch Native
    • [x] All the transforms subtract the mean and divide by the std by default
    • [x] Implemented a proper serialization and deserialization
    • [x] Allowed to use dict instead of a data container
    • [x] Introduced shorter transform names
    • [x] Implemented a fair benchmark to compare with other libraries
    • [x] Fixed a bug of JPEGCompression (@tiulpin )
    • [x] Added IntensityRemappping transform (@soupault )

    This release is not backward compatible

  • v0.1.8(Jul 26, 2019)

  • v0.1.7(Jul 6, 2019)

    This release has some tests-related features and also adds two new transforms.

    Detailed description:

    • [x] Improved and parametrized many more tests
    • [x] Added CutOut data augmentation
    • [x] Added KeypointsJitter class that allows to apply random displacements to keypoints
  • v0.1.6(May 25, 2019)

  • v0.1.5(Feb 15, 2019)

  • v0.1(Nov 13, 2018)

  • v0.0.6(Sep 21, 2018)

Owner
Research Unit of Medical Imaging, Physics and Technology