PyTorch implementation of "Efficient Neural Architecture Search via Parameters Sharing"

Last update: Dec 31, 2022

Overview

Efficient Neural Architecture Search (ENAS) in PyTorch

PyTorch implementation of Efficient Neural Architecture Search via Parameters Sharing.

ENAS reduces the computational requirement (GPU-hours) of Neural Architecture Search (NAS) by 1000x via parameter sharing between models that are subgraphs within a large computational graph. The paper reports state-of-the-art results on Penn Treebank language modeling.

**[Caveat] Use official code from the authors: link**

Prerequisites

Python 3.6+
PyTorch==0.3.1
tqdm, scipy, imageio, graphviz, tensorboardX

Usage

Install prerequisites with:

conda install graphviz
pip install -r requirements.txt

To train ENAS to discover a recurrent cell for RNN:

python main.py --network_type rnn --dataset ptb --controller_optim adam --controller_lr 0.00035 \
               --shared_optim sgd --shared_lr 20.0 --entropy_coeff 0.0001

python main.py --network_type rnn --dataset wikitext

To train ENAS to discover a CNN architecture (in progress):

python main.py --network_type cnn --dataset cifar --controller_optim momentum --controller_lr_cosine=True \
               --controller_lr_max 0.05 --controller_lr_min 0.0001 --entropy_coeff 0.1

Alternatively, use your own dataset by placing text files or images in the following layout:

data
├── YOUR_TEXT_DATASET
│   ├── test.txt
│   ├── train.txt
│   └── valid.txt
├── YOUR_IMAGE_DATASET
│   ├── test
│   │   ├── xxx.jpg (name doesn't matter)
│   │   ├── yyy.jpg (name doesn't matter)
│   │   └── ...
│   ├── train
│   │   ├── xxx.jpg
│   │   └── ...
│   └── valid
│       ├── xxx.jpg
│       └── ...
├── image.py
└── text.py
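
As an illustration of the text layout above, here is a minimal sketch of reading the three split files and mapping tokens to integer ids. The function name `load_text_dataset`, the whitespace tokenization, and the `<eos>` end-of-line marker are assumptions made for this example; the repository's own `data/text.py` may work differently.

```python
from pathlib import Path


def load_text_dataset(root):
    """Read train/valid/test .txt files under `root` (hypothetical helper).

    Returns (vocab, ids): vocab maps token -> integer id, and ids maps each
    split name to its list of token ids.
    """
    root = Path(root)
    splits = {}
    for split in ("train", "valid", "test"):
        text = (root / f"{split}.txt").read_text(encoding="utf-8")
        tokens = []
        for line in text.splitlines():
            # Assumption: whitespace tokenization, "<eos>" closes each line.
            tokens.extend(line.split() + ["<eos>"])
        splits[split] = tokens

    # Ids are assigned in order of first appearance across all splits.
    vocab = {}
    for split in ("train", "valid", "test"):
        for token in splits[split]:
            vocab.setdefault(token, len(vocab))

    ids = {split: [vocab[t] for t in tokens] for split, tokens in splits.items()}
    return vocab, ids
```

Calling `load_text_dataset("data/YOUR_TEXT_DATASET")` would return the vocabulary together with one id sequence per split.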

To generate a GIF image of the generated samples:

python generate_gif.py --model_name=ptb_2018-02-15_11-20-02 --output=sample.gif

More configurations can be found here.
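
The CNN command above sets `--controller_lr_cosine=True` together with `--controller_lr_max 0.05` and `--controller_lr_min 0.0001`. How `main.py` applies these flags is not shown here, so the following is only a sketch of the standard cosine annealing formula (without warm restarts) using those two bounds. The function name `cosine_lr` and the `step` / `total_steps` parameters are hypothetical.

```python
import math


def cosine_lr(step, total_steps, lr_max=0.05, lr_min=0.0001):
    """Cosine-annealed learning rate, decaying from lr_max to lr_min.

    Assumption: a single decay over `total_steps` with no restarts.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay within the bounds.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

The rate starts at `lr_max`, passes through the midpoint of the two bounds halfway through, and ends at `lr_min`.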

Results

Efficient Neural Architecture Search (ENAS) is composed of two sets of learnable parameters: the controller LSTM parameters θ and the shared parameters ω. The two sets are trained alternately, and only the trained controller is used to derive novel architectures.
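
The alternating scheme can be sketched on a toy problem. This is not the repository's training loop: the controller is reduced to a vector of softmax logits (`theta`) standing in for the LSTM, the shared parameters are one scalar weight per candidate activation (`omega`), and the task is a made-up one-dimensional regression. The controller update uses REINFORCE with a moving-average baseline, and the reward is assumed to be the negative validation loss.

```python
import math
import random

# Candidate activation functions the toy controller chooses between.
ACTIVATIONS = {
    "identity": lambda x: x,
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
}
NAMES = list(ACTIVATIONS)


def target(x):
    """Toy regression target; only the ReLU candidate can fit it exactly."""
    return 2.0 * max(0.0, x)


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mse(omega, arch, xs):
    """Mean squared error of the child model selected by `arch`."""
    f = ACTIVATIONS[NAMES[arch]]
    return sum((omega[arch] * f(x) - target(x)) ** 2 for x in xs) / len(xs)


def mse_grad(omega, arch, xs):
    """Gradient of `mse` with respect to omega[arch]."""
    f = ACTIVATIONS[NAMES[arch]]
    return sum(2.0 * (omega[arch] * f(x) - target(x)) * f(x) for x in xs) / len(xs)


def train_alternately(epochs=100, steps=10, shared_lr=1.0, controller_lr=0.5, seed=0):
    rng = random.Random(seed)
    train_xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    valid_xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    theta = [0.0] * len(NAMES)  # controller logits (stand-in for the LSTM)
    omega = [0.0] * len(NAMES)  # shared weights, kept across sampled models
    baseline = 0.0
    for _ in range(epochs):
        # Phase 1: theta is frozen; update omega on sampled architectures.
        for _ in range(steps):
            arch = rng.choices(range(len(NAMES)), weights=softmax(theta))[0]
            omega[arch] -= shared_lr * mse_grad(omega, arch, train_xs)
        # Phase 2: omega is frozen; update theta with REINFORCE.
        for _ in range(steps):
            probs = softmax(theta)
            arch = rng.choices(range(len(NAMES)), weights=probs)[0]
            reward = -mse(omega, arch, valid_xs)  # assumed reward definition
            advantage = reward - baseline
            baseline = 0.9 * baseline + 0.1 * reward
            for j in range(len(theta)):
                # d log p(arch) / d theta[j] for a softmax policy.
                grad_log_prob = (1.0 if j == arch else 0.0) - probs[j]
                theta[j] += controller_lr * advantage * grad_log_prob
    return theta, omega
```

After training, the largest entry of `theta` identifies the candidate the controller prefers, which mirrors how only the trained controller is consulted to derive an architecture.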

1. Discovering Recurrent Cells

The controller LSTM decides 1) which activation function to use and 2) which previous node to connect to.
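
To make these two decisions concrete, the sketch below samples a cell description with a uniform random policy in place of the trained controller LSTM. The activation list follows the ENAS paper's recurrent search space. The names `sample_cell` and `leaf_nodes` are hypothetical, and treating nodes that feed no later node as the cell's outputs follows the paper's description rather than anything stated in this README.

```python
import random

# Activation choices in the recurrent search space (per the ENAS paper).
ACTIVATION_NAMES = ["tanh", "relu", "identity", "sigmoid"]


def sample_cell(num_nodes, rng):
    """Sample (previous_node, activation) for each node of a recurrent cell.

    A uniform random policy stands in for the controller LSTM here.
    """
    cell = []
    for node in range(num_nodes):
        activation = rng.choice(ACTIVATION_NAMES)
        # Node 0 reads the cell inputs, so it has no previous node to pick.
        prev = None if node == 0 else rng.randrange(node)
        cell.append((prev, activation))
    return cell


def leaf_nodes(cell):
    """Return nodes whose output is not consumed by any later node."""
    used = {prev for prev, _ in cell if prev is not None}
    return [node for node in range(len(cell)) if node not in used]
```

Because each node may only connect to an earlier node, every sampled cell is a directed acyclic graph, and the last node is always a leaf.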

The RNN cells ENAS discovered for the Penn Treebank and WikiText-2 datasets:

Best discovered ENAS cell for Penn Treebank at epoch 27:

You can see the details of training (e.g. reward, entropy, loss) with:

tensorboard --logdir=logs --port=6006

2. Discovering Convolutional Neural Networks

The controller LSTM samples 1) which operation to use and 2) which previous node to connect to.

The CNN ENAS discovered for the CIFAR-10 dataset:

(in progress)

3. Designing Convolutional Cells

(in progress)

Reference

Author

Taehoon Kim / @carpedm20
