Facilitates implementing deep neural-network backbones, data augmentations

Overview

Introduction

Nowadays, the training of Deep Learning models is fragmented and unified. When AI engineers face up with one specific task, the common way was to find a repo and reimplement them. Thus, it is really hard for them to speed up the implementation of a big project in which requires a continuous try-end-error process to find the best model. general_backbone is launched to facilitate for implementation of deep neural-network backbones, data augmentations, optimizers, and learning schedulers that all in one package. Finally, you can quick-win the training process. Below are these supported sectors in the current version:

  • backbones
  • loss functions
  • augumentation styles
  • optimizers
  • schedulers
  • data types
  • visualizations

Installation

Refer to docs/installation.md for installion of general_backbone package.

Model backbone

Currently, general_backbone supports more than 70 type of resnet models such as: resnet18, resnet34, resnet50, resnet101, resnet152, resnext50.

All models is supported can be found in general_backbone.list_models() function:

import general_backbone
general_backbone.list_models()

Results

{'resnet': ['resnet18', 'resnet18d', 'resnet34', 'resnet34d', 'resnet26', 'resnet26d', 'resnet26t', 'resnet50', 'resnet50d', 'resnet50t', 'resnet101', 'resnet101d', 'resnet152', 'resnet152d', 'resnet200', 'resnet200d', 'tv_resnet34', 'tv_resnet50', 'tv_resnet101', 'tv_resnet152', 'wide_resnet50_2', 'wide_resnet101_2', 'resnext50_32x4d', 'resnext50d_32x4d', 'resnext101_32x4d', 'resnext101_32x8d', 'resnext101_64x4d', 'tv_resnext50_32x4d', 'ig_resnext101_32x8d', 'ig_resnext101_32x16d', 'ig_resnext101_32x32d', 'ig_resnext101_32x48d', 'ssl_resnet18', 'ssl_resnet50', 'ssl_resnext50_32x4d', 'ssl_resnext101_32x4d', 'ssl_resnext101_32x8d', 'ssl_resnext101_32x16d', 'swsl_resnet18', 'swsl_resnet50', 'swsl_resnext50_32x4d', 'swsl_resnext101_32x4d', 'swsl_resnext101_32x8d', 'swsl_resnext101_32x16d', 'seresnet18', 'seresnet34', 'seresnet50', 'seresnet50t', 'seresnet101', 'seresnet152', 'seresnet152d', 'seresnet200d', 'seresnet269d', 'seresnext26d_32x4d', 'seresnext26t_32x4d', 'seresnext50_32x4d', 'seresnext101_32x4d', 'seresnext101_32x8d', 'senet154', 'ecaresnet26t', 'ecaresnetlight', 'ecaresnet50d', 'ecaresnet50d_pruned', 'ecaresnet50t', 'ecaresnet101d', 'ecaresnet101d_pruned', 'ecaresnet200d', 'ecaresnet269d', 'ecaresnext26t_32x4d', 'ecaresnext50t_32x4d', 'resnetblur18', 'resnetblur50', 'resnetrs50', 'resnetrs101', 'resnetrs152', 'resnetrs200', 'resnetrs270', 'resnetrs350', 'resnetrs420']}

To select your backbone type, you set model=resnet50 in train_config of your config file. An example config file general_backbone/configs/image_clf_config.py.

Dataset

A toy dataset is provided at toydata for your test training. It has a structure organized as below:

toydata/
└── image_classification
    ├── test
    │   ├── cat
    │   └── dog
    └── train
        ├── cat
        └── dog

Inside each folder cat and dog is the images. If you want to add a new class, you just need to create a new folder with the folder's name is label name inside train and test folder.

Data Augmentation

general_backbone package support many augmentations style for training. It is efficient and important to improve model accuracy. Some of common augumentations is below:

Augumentation Style Parameters Description
Pixel-level transforms
Blur {'blur_limit':7, 'always_apply':False, 'p':0.5} Blur the input image using a random-sized kernel
GaussNoise {'var_limit':(10.0, 50.0), 'mean':0, 'per_channel':True, 'always_apply':False, 'p':0.5} Apply gaussian noise to the input image
GaussianBlur {'blur_limit':(3, 7), 'sigma_limit':0, 'always_apply':False, 'p':0.5} Blur the input image using a Gaussian filter with a random kernel size
GlassBlur {'sigma': 0.7, 'max_delta':4, 'iterations':2, 'always_apply':False, 'mode':'fast', 'p':0.5} Apply glass noise to the input image
HueSaturationValue {'hue_shift_limit':20, 'sat_shift_limit':30, 'val_shift_limit':20, 'always_apply':False, 'p':0.5} Randomly change hue, saturation and value of the input image
MedianBlur {'blur_limit':7, 'always_apply':False, 'p':0.5} Blur the input image using a median filter with a random aperture linear size
RGBShift {'r_shift_limit': 15, 'g_shift_limit': 15, 'b_shift_limit': 15, 'p': 0.5} Randomly shift values for each channel of the input RGB image.
Normalize {'mean':(0.485, 0.456, 0.406), 'std':(0.229, 0.224, 0.225)} Normalization is applied by the formula: img = (img - mean * max_pixel_value) / (std * max_pixel_value)
Spatial-level transforms
RandomCrop {'height':128, 'width':128} Crop a random part of the input
VerticalFlip {'p': 0.5} Flip the input vertically around the x-axis
ShiftScaleRotate {'shift_limit':0.05, 'scale_limit':0.05, 'rotate_limit':15, 'p':0.5} Randomly apply affine transforms: translate, scale and rotate the input
RandomBrightnessContrast {'brightness_limit':0.2, 'contrast_limit':0.2, 'brightness_by_max':True, 'always_apply':False,'p': 0.5} Randomly change brightness and contrast of the input image

Augumentation is configured in the configuration file general_backbone/configs/image_clf_config.py:

data_conf = dict(
    dict_transform = dict(
        SmallestMaxSize={'max_size': 160},
        ShiftScaleRotate={'shift_limit':0.05, 'scale_limit':0.05, 'rotate_limit':15, 'p':0.5},
        RandomCrop={'height':128, 'width':128},
        RGBShift={'r_shift_limit': 15, 'g_shift_limit': 15, 'b_shift_limit': 15, 'p': 0.5},
        RandomBrightnessContrast={'p': 0.5},
        Normalize={'mean':(0.485, 0.456, 0.406), 'std':(0.229, 0.224, 0.225)},
        ToTensorV2={'always_apply':True}
    )
)

You can add a new transformation step in data_conf['dict_transform'] and they are transformed in order from top-down. You can also debug your transformation by setup debug=True:

from general_backbone.data import AugmentationDataset
augdataset = AugmentationDataset(data_dir='toydata/image_classification',
                            name_split='train',
                            config_file = 'general_backbone/configs/image_clf_config.py', 
                            dict_transform=None, 
                            input_size=(256, 256), 
                            debug=True, 
                            dir_debug = 'tmp/alb_img_debug', 
                            class_2_idx=None)

for i in range(50):
    img, label = augdataset.__getitem__(i)

In default, the augmentation images output is saved in tmp/alb_img_debug to you review before train your models. the code tests augmentation image is available in debug/transform_debug.py:

conda activate gen_backbone
python debug/transform_debug.py

Train model

To train model, you run file tools/train.py. There are variaty of config for your training such as --model, --batch_size, --opt, --loss, --sched. We supply to you a standard configuration file to train your model through --config. general_backbone/configs/image_clf_config.py is for image classification task. You can change value inside this file or add new parameter as you want but without changing the name and structure of file.

python3 tools/train.py --config general_backbone/configs/image_clf_config.py

Results:

Model resnet50 created, param count:25557032
Train: 0 [   0/33 (  0%)]  Loss: 8.863 (8.86)  Time: 1.663s,    9.62/s  (1.663s,    9.62/s)  LR: 5.000e-04  Data: 0.460 (0.460)
Train: 0 [  32/33 (100%)]  Loss: 1.336 (4.00)  Time: 0.934s,    8.57/s  (0.218s,   36.68/s)  LR: 5.000e-04  Data: 0.000 (0.014)
Test: [   0/29]  Time: 0.560 (0.560)  Loss:  0.6912 (0.6912)  [email protected]: 87.5000 (87.5000)  [email protected]: 100.0000 (100.0000)
Test: [  29/29]  Time: 0.041 (0.064)  Loss:  0.5951 (0.5882)  [email protected]: 81.2500 (87.5000)  [email protected]: 100.0000 (99.3750)
Train: 1 [   0/33 (  0%)]  Loss: 0.5741 (0.574)  Time: 0.645s,   24.82/s  (0.645s,   24.82/s)  LR: 5.000e-04  Data: 0.477 (0.477)
Train: 1 [  32/33 (100%)]  Loss: 0.5411 (0.313)  Time: 0.089s,   90.32/s  (0.166s,   48.17/s)  LR: 5.000e-04  Data: 0.000 (0.016)
Test: [   0/29]  Time: 0.537 (0.537)  Loss:  0.3071 (0.3071)  [email protected]: 87.5000 (87.5000)  [email protected]: 100.0000 (100.0000)
Test: [  29/29]  Time: 0.043 (0.066)  Loss:  0.1036 (0.1876)  [email protected]: 100.0000 (93.9583)  [email protected]: 100.0000 (100.0000)

Table of config parameters is in training.

Your model checkpoint and log are saved in the same path of --output directory. A tensorboard visualization is created in order to facilitate manage and control training process. As default, folder of tensorboard is runs that insides --output. The loss, accuracy, learning rate and batch time on both train and test are logged:

tensorboard --logdir checkpoint/resnet50/20211023-092651-resnet50-224/runs/

Inference

To inference model, you can pass relevant values to --img, --config and --initial-checkpoint.

python tools/inference.py --img demo/cat0.jpg --config general_backbone/configs/image_clf_config.py --initial-checkpoint checkpoint.pth.tar

TODO

Packages reference:

There are many open sources package we refered to build up general_backbone:

  • timm: PyTorch Image Models (timm) is a collection of image models, layers, utilities, optimizers, schedulers, data-loaders / augmentations, and reference training / validation scripts that aim to pull together a wide variety of SOTA models with ability to reproduce ImageNet training results.

  • albumentations: is a Python library for image augmentation.

  • mmcv: MMCV is a foundational library for computer vision research and supports many research projects.

Citation

If you find this project is useful in your reasearch, kindly consider cite:

@article{genearal_backbone,
    title={GeneralBackbone:  A handy package for implementing Deep Learning Backbone},
    author={khanhphamdinh},
    email= {[email protected]},
    year={2021}
}
You might also like...
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.
BisQue is a web-based platform designed to provide researchers with organizational and quantitative analysis tools for 5D image data. Users can extend BisQue by implementing containerized ML workflows.

Overview BisQue is a web-based platform specifically designed to provide researchers with organizational and quantitative analysis tools for up to 5D

This is a model made out of Neural Network specifically a Convolutional Neural Network model
This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternative libraries that can be used for this purpose, one of which is the PyTorch library.

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks

Bayesian-Torch is a library of neural network layers and utilities extending the core of PyTorch to enable the user to perform stochastic variational inference in Bayesian deep neural networks. Bayesian-Torch is designed to be flexible and seamless in extending a deterministic deep neural network architecture to corresponding Bayesian form by simply replacing the deterministic layers with Bayesian layers.

Deep learning (neural network) based remote photoplethysmography: how to extract pulse signal from video using deep learning tools

Deep-rPPG: Camera-based pulse estimation using deep learning tools Deep learning (neural network) based remote photoplethysmography: how to extract pu

This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), Attentive Neural Processes (ANPs).

The Neural Process Family This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CN

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch. Detectron Detectron is Facebook AI Research's software sy

A python library for implementing a recommender system

python-recsys A python library for implementing a recommender system. Installation Dependencies python-recsys is build on top of Divisi2, with csc-pys

Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.
Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.

Framework overview This library allows to quickly implement different architectures based on Reservoir Computing (the family of approaches popularized

PyTorch Autoencoders - Implementing a Variational Autoencoder (VAE) Series in Pytorch.

PyTorch Autoencoders Implementing a Variational Autoencoder (VAE) Series in Pytorch. Inspired by this repository Model List check model paper conferen

Releases(v0.2.1)
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Machine learning evaluation metrics, implemented in Python, R, Haskell, and MATLAB / Octave

Note: the current releases of this toolbox are a beta release, to test working with Haskell's, Python's, and R's code repositories. Metrics provides i

Ben Hamner 1.6k Dec 26, 2022
Implementation of Neural Style Transfer in Pytorch

PytorchNeuralStyleTransfer Code to run Neural Style Transfer from our paper Image Style Transfer Using Convolutional Neural Networks. Also includes co

Leon Gatys 396 Dec 01, 2022
Heterogeneous Temporal Graph Neural Network

Heterogeneous Temporal Graph Neural Network This repository contains the datasets and source code of HTGNN. run_mag.ipynb is the training and testing

15 Dec 22, 2022
Double pendulum simulator using a symplectic Euler's method and Hamiltonian mechanics

Symplectic Double Pendulum Simulator Double pendulum simulator using a symplectic Euler's method. The program calculates the momentum and position of

Scott Marino 1 Jan 12, 2022
Privacy as Code for DSAR Orchestration: Privacy Request automation to fulfill GDPR, CCPA, and LGPD data subject requests.

Meet Fidesops: Privacy as Code for DSAR Orchestration A part of the greater Fides ecosystem. ⚡ Overview Fidesops (fee-dez-äps, combination of the Lati

Ethyca 44 Dec 06, 2022
Özlem Taşkın 0 Feb 23, 2022
DecoupledNet is semantic segmentation system which using heterogeneous annotations

DecoupledNet: Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation Created by Seunghoon Hong, Hyeonwoo Noh and Bohyung Han at POSTE

Hyeonwoo Noh 74 Sep 22, 2021
Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System This repository contains code for the paper Schultheis,

2 Oct 28, 2022
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

UNION Automatic Evaluation Metric described in the paper UNION: An UNreferenced MetrIc for Evaluating Open-eNded Story Generation (EMNLP 2020). Please

50 Dec 30, 2022
On-device speech-to-intent engine powered by deep learning

Rhino Made in Vancouver, Canada by Picovoice Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a giv

Picovoice 510 Dec 30, 2022
From Fidelity to Perceptual Quality: A Semi-Supervised Approach for Low-Light Image Enhancement (CVPR'2020)

Under-exposure introduces a series of visual degradation, i.e. decreased visibility, intensive noise, and biased color, etc. To address these problems, we propose a novel semi-supervised learning app

Yang Wenhan 117 Jan 03, 2023
A library for optimization on Riemannian manifolds

TensorFlow RiemOpt A library for manifold-constrained optimization in TensorFlow. Installation To install the latest development version from GitHub:

Oleg Smirnov 83 Dec 27, 2022
🐸STT integration examples

🐸 STT 0.9.x Examples These are various examples on how to use or integrate 🐸 STT using our packages. It is a good way to just try out 🐸 STT before

coqui 92 Dec 19, 2022
Hand-distance-measurement-game - Hand Distance Measurement Game

Hand Distance Measurement Game This is program is made to calculate the distance

Priyansh 2 Jan 12, 2022
Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging, ICCV2021 [PyTorch Code]

Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging, ICCV2021 [PyTorch Code]

Jian Zhang 20 Oct 24, 2022
Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021

ATISS: Autoregressive Transformers for Indoor Scene Synthesis This repository contains the code that accompanies our paper ATISS: Autoregressive Trans

138 Dec 22, 2022
Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network This repository is the official implementation of Speech Separati

Kai Li (李凯) 116 Nov 09, 2022
Solution of Kaggle competition: Sartorius - Cell Instance Segmentation

Sartorius - Cell Instance Segmentation https://www.kaggle.com/c/sartorius-cell-instance-segmentation Environment setup Build docker image bash .dev_sc

68 Dec 09, 2022
Concept drift monitoring for HA model servers.

{Fast, Correct, Simple} - pick three Easily compare training and production ML data & model distributions Goals Boxkite is an instrumentation library

98 Dec 15, 2022