CONetV2: Efficient Auto-Channel Size Optimization for CNNs

Related tags

Deep LearningCONetV2
Overview

CONetV2: Efficient Auto-Channel Size Optimization for CNNs

Exciting News! CONetV2: Efficient Auto-Channel Size Optimization for CNNs has been accepted to the International Conference on Machine Learning and Applications (ICMLA) 2021 for Oral Presentation!

CONetV2: Efficient Auto-Channel Size Optimization for CNNs,
Yi Ru Wang, Samir Khaki, Weihang Zheng, Mahdi S. Hosseini, Konstantinos N. Plataniotis
In Proceedings of the IEEE International Conference on Machine Learning and Applications (ICMLA)

Checkout our arXiv preprint: Paper

Overview

Neural Architecture Search (NAS) has been pivotal in finding optimal network configurations for Convolution Neural Networks (CNNs). While many methods explore NAS from a global search space perspective, the employed optimization schemes typically require heavy computation resources. Instead, our work excels in computationally constrained environments by examining the micro-search space of channel size, the optimization of which is effective in outperforming baselines. In tackling channel size optimization, we design an automated algorithm to extract the dependencies within channel sizes of different connected layers. In addition, we introduce the idea of Knowledge Distillation, which enables preservation of trained weights, admist trials where the channel sizes are changing. As well, because standard performance indicators (accuracy, loss) fails to capture the performance of individual network components, we introduce a novel metric that has high correlation with test accuracy and enables analysis of individual network layers. Combining Dependency Extraction, metrics, and knowledge distillation, we introduce an efficient search algorithm, with simulated annealing inspired stochasticity, and demonstrate its effectiveness in outperforming baselines by a large margin, while only utilizing a fraction of the trainable parameters.

Results

We report our results below for ResNet34. On the left we provide a comparison of our method compared to the baseline, compared to Compound Scaling and Random Optimization. On the right we compare the two variations of our method: Simulated Annealing (Left), Greedy (Right). For further experiments and results, please refer to our paper.

Accuracy vs. Parameters Channel Evolution Comparison

Table of Contents

Getting Started

Dependencies

  • Requirements are specified in requirements.txt
certifi==2020.6.20
cycler==0.10.0
et-xmlfile==1.0.1
future==0.18.2
graphviz==0.14.2
jdcal==1.4.1
kiwisolver==1.2.0
matplotlib==3.3.2
memory-profiler==0.57.0
numpy==1.19.2
openpyxl==3.0.5
pandas==1.1.3
Pillow==8.0.0
pip==18.1
pkg-resources==0.0.0
psutil==5.7.2
ptflops==0.6.2
pyparsing==2.4.7
python-dateutil==2.8.1
pytz ==2020.1
PyYAML==5.3.1
scipy==1.5.2
setuptools==40.8.0
six==1.15.0
torch==1.6.0
torchvision==0.7.0
torchviz==0.0.1
wheel==0.35.1
xlrd==1.2.0

Executing program

To run the main searching script for searching on ResNet34:

cd CONetV2
python main.py --config='./configs/config_resnet.yaml' --gamma=0.8 --optimization_algorithm='SA' --post_fix=1

We also provide a script for training using slurm in slurm_scripts/run.sh. Update parameters on Line 6, 9, and 10 to use.

sbatch slurm_scripts/run.sh

Options for Training

--config CONFIG             # Set root path of project that parents all others:
                            Default = './configs/config.yaml'
--data DATA_PATH            # Set data directory path: 
                            Default = '.adas-data'
--output OUTPUT_PATH        # Set the directory for output files,  
                            Default = 'adas_search'
--root ROOT                 # Set root path of project that parents all others: 
                            Default = '.'
--model MODEL_TYPE          # Set the model type for searching {'resnet34', 'darts'}
                            Default = None
--gamma                     # Momentum tuning factor
                            Default = None
--optimization_algorithm    # Type of channel search algorithm {'greedy', 'SA'}
                            Default = None

Training Output

All training output will be saved to the OUTPUT_PATH location. After a full experiment, results will be recorded in the following format:

  • OUTPUT_PATH/EXPERIMENT_FOLDER
    • full_train
      • performance.xlsx: results for the full train, including GMac, Parameters(M), and accuracies & losses (Train & Test) per epoch.
    • Trials
      • adapted_architectures.xlsx: channel size evolution per convolution layer throughout searching trials.
      • trial_{n}.xlsx: Details of the particular trial, including metric values for every epoch within the trial.
    • ckpt.pth: Checkpoint of the model which achieved the highest test accuracy during full train.

Code Organization

Configs

We provide the configuration files for ResNet34 and DARTS7 for running automated channel size search.

  • configs/config_resnet.yaml
  • configs/config_darts.yaml

Dependency Extraction

Code for dependency extraction are in three primary modules: model to adjacency list conversion, adjacency list to linked list conversion, and linked list to dependency list conversion.

  • dependency/LLADJ.py: Functions for a variety of skeleton models for automated adjacency list extraction given pytorch model instance.
  • dependency/LinkedListConstructor.py: Automated conversion of a adjacency list representation to linked list.
  • dependency/getDependency.py: Extract dependencies based on linked list representation.

Metrics

Code for computing several metrics. Note that we use the QC Metric.

  • metrics/components.py: Helper functions for computing metrics
  • metrics/metrics.py: Script for computing different metrics

Models

Code for all supported models: ResNet34 and Darts7

  • models/darts.py: Pytorch construction of the Darts7 Model Architecture.
  • models/resnet.py: Pytorch construction of the ResNet34 Model Architecture

Optimizers

Code for all optimizer options and learning rate schedulers for training networks. Options include: AdaS, SGD, StepLR, MultiStepLR, CosineAnnealing, etc.

  • optim/*

Scaling Method

Channel size scaling algorithm between trials.

  • scaling_method/default_scaling.py: Contains the functions for scaling of channel sizes based on computed metrics.

Searching Algorithm

Code for channel size searching algorithms.

  • searching_algorithm/common.py: Common functions used for searching algorithms.
  • searching_algorithm/greedy.py: Greedy way of searching for channel sizes, always steps in the direction that yields the optimal local solution.
  • searching_algorithm/simulated_annealing.py: Simulated annealing inspired searching, induced stochasticity with magnitute of scaling.

Visualization

Helper functions for visualization of metric evolution.

  • visualization/draw_channel_scaling.py: visualization of channel size evolution.
  • visualization/plotting_layers_by_trial.py: visualization of layer channel size changes across different search trials.
  • visualization/plotting_metric_by_trial.py: visualization of metric evolution for different layers across search trials.
  • visualization/plotting_metric_by_epoch.py: visualization of metric evolution through the epochs during full train.

Utils

Helper functions for training.

  • utils/create_dataframe.py: Constructs dataframes for storing output files.
  • utils/test.py: Running accuracy and loss tests per epoch.
  • utils/train_helpers.py: Helper functions for training epochs.
  • utils/utils.py: Helper functions.
  • utils/weight_transfer.py: Function to execute knowledge distillation across trials.

Version History

  • 0.1
    • Initial Release
Owner
Mahdi S. Hosseini
Assistant Professor in ECE Department at University of New Brunswick. My research interests cover broad topics in Machine Learning and Computer Vision problems
Mahdi S. Hosseini
Image transformations designed for Scene Text Recognition (STR) data augmentation. Published at ICCV 2021 Workshop on Interactive Labeling and Data Augmentation for Vision.

Data Augmentation for Scene Text Recognition (ICCV 2021 Workshop) (Pronounced as "strog") Paper Arxiv Why it matters? Scene Text Recognition (STR) req

Rowel Atienza 152 Dec 28, 2022
Continual World is a benchmark for continual reinforcement learning

Continual World Continual World is a benchmark for continual reinforcement learning. It contains realistic robotic tasks which come from MetaWorld. Th

41 Dec 24, 2022
Code and model benchmarks for "SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology"

NeurIPS 2020 SEVIR Code for paper: SEVIR : A Storm Event Imagery Dataset for Deep Learning Applications in Radar and Satellite Meteorology Requirement

USAF - MIT Artificial Intelligence Accelerator 46 Dec 15, 2022
[CVPR 2022 Oral] MixFormer: End-to-End Tracking with Iterative Mixed Attention

MixFormer The official implementation of the CVPR 2022 paper MixFormer: End-to-End Tracking with Iterative Mixed Attention [Models and Raw results] (G

Multimedia Computing Group, Nanjing University 235 Jan 03, 2023
MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

MINIROCKET: A Very Fast (Almost) Deterministic Transform for Time Series Classification

187 Dec 26, 2022
Author Disambiguation using Knowledge Graph Embeddings with Literals

Author Name Disambiguation with Knowledge Graph Embeddings using Literals This is the repository for the master thesis project on Knowledge Graph Embe

12 Oct 19, 2022
The 3rd place solution for competition

The 3rd place solution for competition "Lyft Motion Prediction for Autonomous Vehicles" at Kaggle Team behind this solution: Artsiom Sanakoyeu [Homepa

Artsiom 104 Nov 22, 2022
GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications

GPOEO GPOEO is a micro-intrusive GPU online energy optimization framework for iterative applications. We also implement ODPP [1] as a comparison. [1]

瑞雪轻飏 8 Sep 10, 2022
Extreme Rotation Estimation using Dense Correlation Volumes

Extreme Rotation Estimation using Dense Correlation Volumes This repository contains a PyTorch implementation of the paper: Extreme Rotation Estimatio

Ruojin Cai 29 Nov 18, 2022
https://sites.google.com/cornell.edu/recsys2021tutorial

Counterfactual Learning and Evaluation for Recommender Systems (RecSys'21 Tutorial) Materials for "Counterfactual Learning and Evaluation for Recommen

yuta-saito 45 Nov 10, 2022
House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects

House-GAN++ Code and instructions for our paper: House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent

122 Dec 28, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
Intel® Nervana™ reference deep learning framework committed to best performance on all hardware

DISCONTINUATION OF PROJECT. This project will no longer be maintained by Intel. Intel will not provide or guarantee development of or support for this

Nervana 3.9k Dec 20, 2022
Hierarchical Time Series Forecasting with a familiar API

scikit-hts Hierarchical Time Series with a familiar API. This is the result from not having found any good implementations of HTS on-line, and my work

Carlo Mazzaferro 204 Dec 17, 2022
QuanTaichi evaluation suite

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 120 Jan 04, 2023
Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents

DeepXML Code for DeepXML: A Deep Extreme Multi-Label Learning Framework Applied to Short Text Documents Architectures and algorithms DeepXML supports

Extreme Classification 49 Nov 06, 2022
Video-based open-world segmentation

UVO_Challenge Team Alpes_runner Solutions This is an official repo for our UVO Challenge solutions for Image/Video-based open-world segmentation. Our

Yuming Du 84 Dec 22, 2022
PyTorch wrapper for Taichi data-oriented class

Stannum PyTorch wrapper for Taichi data-oriented class PRs are welcomed, please see TODOs. Usage from stannum import Tin import torch data_oriented =

86 Dec 23, 2022
Code for DeepCurrents: Learning Implicit Representations of Shapes with Boundaries

DeepCurrents | Webpage | Paper DeepCurrents: Learning Implicit Representations of Shapes with Boundaries David Palmer*, Dmitriy Smirnov*, Stephanie Wa

Dima Smirnov 36 Dec 08, 2022
ImageNet-CoG is a benchmark for concept generalization. It provides a full evaluation framework for pre-trained visual representations which measure how well they generalize to unseen concepts.

The ImageNet-CoG Benchmark Project Website Paper (arXiv) Code repository for the ImageNet-CoG Benchmark introduced in the paper "Concept Generalizatio

NAVER 23 Oct 09, 2022