ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Last update: Oct 02, 2022

Overview

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

This repository is the official implementation of the empirical research presented in the supplementary material of the paper, ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees.

Requirements

To install requirements:

pip install -r requirements.txt

Please install Python before running the above setup command. The code was tested on Python 3.8.10.

Create a folder to store all the models and results:

mkdir ckeckpoint

Training

To fully replicate the results below, train all the models by running the following two commands:

./train_cuda0.sh

./train_cuda1.sh

We used two separate scripts because we had two NVIDIA GPUs and we wanted to run two training processes for different models at the same time. If you have more GPUs or resources, you can submit multiple jobs and let them run in parallel.

To train a model with different seeds (initializations), run the command in the following form:

python main.py --data <dataset> --model <DNN_model> --mu <learning_rate>

The above command uses the default seed list. You can also specify your seeds like the following example:

python main.py --data CIFAR10 --model CIFAR10_BNResNEst_ResNet_110 --seed_list 8 9

Run this command to see how to customize your training or hyperparameters:

python main.py --help

Evaluation

To evaluate all trained models on benchmarks reported in the tables below, run:

./eval.sh

To evaluate a model, run:

python eval.py --data  <dataset> --model <DNN_model> --seed_list <seed>

Results

Image Classification on CIFAR-10

Architecture	Standard	ResNEst	BN-ResNEst	A-ResNEst
WRN-16-8	95.58% (11M)	94.47% (11M)	95.49% (11M)	95.29% (8.7M)
WRN-40-4	95.49% (9.0M)	94.64% (9.0M)	95.62% (9.0M)	95.48% (8.4M)
ResNet-110	94.33% (1.7M)	92.62% (1.7M)	94.47% (1.7M)	93.93% (1.7M)
ResNet-20	92.58% (0.27M)	90.98% (0.27M)	92.56% (0.27M)	92.47% (0.24M)

Image Classification on CIFAR-100

Architecture	Standard	ResNEst	BN-ResNEst	A-ResNEst
WRN-16-8	79.14% (11M)	75.42% (11M)	78.98% (11M)	78.74% (8.9M)
WRN-40-4	79.08% (9.0M)	75.16% (9.0M)	78.81% (9.0M)	78.69% (8.7M)
ResNet-110	74.08% (1.7M)	69.08% (1.7M)	74.24% (1.7M)	72.53% (1.9M)
ResNet-20	68.56% (0.28M)	64.73% (0.28M)	68.49% (0.28M)	68.16% (0.27M)

BibTeX

@inproceedings{chen2021resnests,
  title={{ResNEsts} and {DenseNEsts}: Block-based {DNN} Models with Improved Representation Guarantees},
  author={Chen, Kuan-Lin and Lee, Ching-Hua and Garudadri, Harinath and Rao, Bhaskar D.},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2021}
}

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Related tags

Overview

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

Requirements

Training

Evaluation

Results

Image Classification on CIFAR-10

Image Classification on CIFAR-100

BibTeX

Owner

Kuan-Lin (Jason) Chen

C3d-pytorch - Pytorch porting of C3D network, with Sports1M weights

Tutorials and implementations for "Self-normalizing networks"

[NeurIPS-2020] Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID.

The Multi-Mission Maximum Likelihood framework (3ML)

BraTs-VNet - BraTS(Brain Tumour Segmentation) using V-Net

Solver for Large-Scale Rank-One Semidefinite Relaxations

Repo for the Video Person Clustering dataset, and code for the associated paper

Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs".

AdamW optimizer and cosine learning rate annealing with restarts

RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos

Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

Flow is a computational framework for deep RL and control experiments for traffic microsimulation.

This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

Image De-raining Using a Conditional Generative Adversarial Network

PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"

Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Context-Sensitive Misspelling Correction of Clinical Text via Conditional Independence, CHIL 2022

MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

An efficient and easy-to-use deep learning model compression framework