A curated list of awesome resources combining Transformers with Neural Architecture Search

Overview

Awesome Transformer Architecture Search: Awesome

To keep track of the large number of recent papers that look at the intersection of Transformers and Neural Architecture Search (NAS), we have created this awesome list of curated papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:

  1. General Transformer search
  2. Domain Specific, applied Transformer search (divided into NLP, Vision, ASR)
  3. Insights on Transformer components or searchable parameters
  4. Transformer Surveys

This repository is maintained by the AutoML Group Freiburg. Please feel free to pull requests or open an issue to add papers.

General Transformer Search

Title Venue Group
UniNet: Unified Architecture Search with Convolutions, Transformer and MLP arxiv [Oct'21] SenseTime
Analyzing and Mitigating Interference in Neural Architecture Search arxiv [Aug'21] Tsinghua, MSR
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search ICCV'21 Sun Yat-sen University
Memory-Efficient Differentiable Transformer Architecture Search ACL-IJCNLP'21 MSR, Peking University
Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition arxiv [Aug'20] Google Research
AutoTrans: Automating Transformer Design via Reinforced Architecture Search arxiv [Sep'20] Fudan University
NAT: Neural Architecture Transformer for Accurate and Compact Architectures NeurIPS'19 Tencent AI
The Evolved Transformer ICML'19 Google Brain

Domain Specific Transformer Search

Vision

Title Venue Group
AutoFormer: Searching Transformers for Visual Recognition ICCV'21 MSR
GLiT: Neural Architecture Search for Global and Local Image Transformer ICCV'21 University of Sydney
Searching for Efficient Multi-Stage Vision Transformers ICCV'21 workshop MIT
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers CVPR'21 Bytedance Inc.
Vision Transformer Architecture Search arxiv [June'21] SenseTime, Tsingua University

Natural Language Processing

Title Venue Group
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models ACL'21 MIT
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search KDD'21 MSR, Tsinghua University
AutoBERT-Zero: Evolving the BERT backbone from scratch arxiv [July'21] Huawei Noah’s Ark Lab
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing ACL'20 MIT

Automatic Speech Recognition

Title Venue Group
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search ICASSP'21 MSR
Darts-Conformer: Towards Efficient Gradient-Based Neural Architecture Search For End-to-End ASR arxiv [Aug'21] NPU, Xi'an
Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search arxiv [April'21] Chinese Academy of Sciences
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition INTERSPEECH'20 VUNO Inc.

Insights on Transformer components and interesting papers

Title Venue Group
Patches are All You Need ? ICLR'22 under review -
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows ICCV'21 best paper MSR
Rethinking Spatial Dimensions of Vision Transformers ICCV'21 NAVER AI
What makes for hierarchical vision transformers arxiv [Sept'21] HUST
AutoAttend: Automated Attention Representation Search ICML'21 Tsinghua University
Rethinking Attention with Performers ICLR'21 Oral Google
LambdaNetworks: Modeling long-range Interactions without Attention ICLR'21 Google Research
HyperGrid Transformers ICLR'21 Google Research
LocalViT: Bringing Locality to Vision Transformers arxiv [April'21] ETH Zurich
NASABN: A Neural Architecture Search Framework for Attention-Based Networks IJCNN'20 Chinese Academy of Sciences
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned ACL'19 Yandex

Transformer Surveys

Title Venue Group
Transformers in Vision: A Survey arxiv [Oct'21] MBZ University of AI
Efficient Transformers: A Survey arxiv [Sept'21] Google Research

Misc resources

Owner
Yash Mehta
Researcher, deep learning 🍁 Previously @GatsbyUCL, @NTUsingapore, @AmazonSDE
Yash Mehta
Deep Networks with Recurrent Layer Aggregation

RLA-Net: Recurrent Layer Aggregation Recurrence along Depth: Deep Networks with Recurrent Layer Aggregation This is an implementation of RLA-Net (acce

Joy Fang 21 Aug 16, 2022
PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021)

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021) This repo presents PyTorch implementation of M

Evgeny 79 Dec 19, 2022
September-Assistant - Open-source Windows Voice Assistant

September - Windows Assistant September is an open-source Windows personal assis

The Nithin Balaji 9 Nov 22, 2022
L-Verse: Bidirectional Generation Between Image and Text

Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalabilty

Kim, Taehoon 102 Dec 21, 2022
This is a custom made virus code in python, using tkinter module.

skeleterrorBetaV0.1-Virus-code This is a custom made virus code in python, using tkinter module. This virus is not harmful to the computer, it only ma

AR 0 Nov 21, 2022
Machine learning notebooks in different subjects optimized to run in google collaboratory

Notebooks Name Description Category Link Training pix2pix This notebook shows a simple pipeline for training pix2pix on a simple dataset. Most of the

Zaid Alyafeai 363 Dec 06, 2022
tsai is an open-source deep learning package built on top of Pytorch & fastai focused on state-of-the-art techniques for time series classification, regression and forecasting.

Time series Timeseries Deep Learning Pytorch fastai - State-of-the-art Deep Learning with Time Series and Sequences in Pytorch / fastai

timeseriesAI 2.8k Jan 08, 2023
Instant-nerf-pytorch - NeRF trained SUPER FAST in pytorch

instant-nerf-pytorch This is WORK IN PROGRESS, please feel free to contribute vi

94 Nov 22, 2022
Official implementation of VQ-Diffusion

Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis

Microsoft 592 Jan 03, 2023
Flow is a computational framework for deep RL and control experiments for traffic microsimulation.

Flow Flow is a computational framework for deep RL and control experiments for traffic microsimulation. See our website for more information on the ap

867 Jan 02, 2023
Simple and Robust Loss Design for Multi-Label Learning with Missing Labels

Simple and Robust Loss Design for Multi-Label Learning with Missing Labels Official PyTorch Implementation of the paper Simple and Robust Loss Design

Xinyu Huang 28 Oct 27, 2022
YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931)

Introduction Yolov5-face is a real-time,high accuracy face detection. Performance Single Scale Inference on VGA resolution(max side is equal to 640 an

DeepCam Shenzhen 1.4k Jan 07, 2023
Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

AudioCLIP Extending CLIP to Image, Text and Audio This repository contains implementation of the models described in the paper arXiv:2106.13043. This

458 Jan 02, 2023
Speech Recognition using DeepSpeech2.

deepspeech.pytorch Implementation of DeepSpeech2 for PyTorch using PyTorch Lightning. The repo supports training/testing and inference using the DeepS

Sean Naren 2k Jan 04, 2023
Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21)

AdvRush Official Code for AdvRush: Searching for Adversarially Robust Neural Architectures (ICCV '21) Environmental Set-up Python == 3.6.12, PyTorch =

11 Dec 10, 2022
TorchX: A PyTorch Extension Library for More Efficient Deep Learning

TorchX TorchX: A PyTorch Extension Library for More Efficient Deep Learning. @misc{torchx, author = {Ansheng You and Changxu Wang}, title = {T

Donny You 8 May 28, 2022
Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Strongly local p-norm-cut algorithms for semi-supervised learning and local graph clustering

Meng Liu 2 Jul 19, 2022
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
Implementation of ReSeg using PyTorch

Implementation of ReSeg using PyTorch ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation Pascal-Part Annotations Pascal VOC 2010

Onur Kaplan 46 Nov 23, 2022
Temporally Coherent GAN SIGGRAPH project.

TecoGAN This repository contains source code and materials for the TecoGAN project, i.e. code for a TEmporally COherent GAN for video super-resolution

Duc Linh Nguyen 2 Jan 18, 2022