[NeurIPS'20] Multiscale Deep Equilibrium Models

Related tags

Deep Learningmdeq
Overview

Multiscale Deep Equilibrium Models

💥 💥 💥 💥

This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simpler & more efficient) implementation of MDEQ with the same set of tasks as here is now available in the DEQ repo.

We STRONGLY recommend using with the MDEQ-Vision code in the DEQ repo (which also supports Jacobian-related analysis).

💥 💥 💥 💥


This repository contains the code for the multiscale deep equilibrium (MDEQ) model proposed in the paper Multiscale Deep Equilibrium Models by Shaojie Bai, Vladlen Koltun and J. Zico Kolter.

Is implicit deep learning relevant for general, large-scale pattern recognition tasks? We propose the multiscale deep equilibrium (MDEQ) model, which expands upon the DEQ formulation substantially to introduce simultaneous equilibrium modeling of multiple signal resolutions. Specifically, MDEQ solves for and backpropagates through synchronized equilibria of multiple feature representation streams. Such structure rectifies one of the major drawbacks of DEQ, and provide natural hierarchical interfaces for auxiliary losses and compound training procedures (e.g., pretraining and finetuning). Our experiment demonstrate for the first time that "shallow" implicit models can scale to and achieve near-SOTA results on practical computer vision tasks (e.g., megapixel images on Cityscapes segmentation).

We provide in this repo the implementation and the links to the pretrained classification & segmentation MDEQ models.

If you find thie repository useful for your research, please consider citing our work:

@inproceedings{bai2020multiscale,
    author    = {Shaojie Bai and Vladlen Koltun and J. Zico Kolter},
    title     = {Multiscale Deep Equilibrium Models},
    booktitle   = {Advances in Neural Information Processing Systems (NeurIPS)},
    year      = {2020},
}

Overview

The structure of a multiscale deep equilibrium model (MDEQ) is shown below. All components of the model are shown in this figure (in practice, we use n=4).

Examples

Some examples of MDEQ segmentation results on the Cityscapes dataset.

Requirements

PyTorch >=1.4.0, torchvision >= 0.4.0

Datasets

  • CIFAR-10: We download the CIFAR-10 dataset using PyTorch's torchvision package (included in this repo).
  • ImageNet We follow the implementation from the PyTorch ImageNet Training repo.
  • Cityscapes: We download the Cityscapes dataset from its official website and process it according to this repo. Cityscapes dataset additionally require a list folder that aligns each original image with its corresponding labeled segmented image. This list folder can be downloaded here.

All datasets should be downloaded, processed and put in the respective data/[DATASET_NAME] directory. The data/ directory should look like the following:

data/
  cityscapes/
  imagenet/
  ...          (other datasets)
  list/        (see above)

Usage

All experiment settings are provided in the .yaml files under the experiments/ folder.

To train an MDEQ classification model on ImageNet/CIFAR-10, do

python tools/cls_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

To train an MDEQ segmentation model on Cityscapes, do

python -m torch.distributed.launch --nproc_per_node=4 tools/seg_train.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

where you should provide the pretrained ImageNet model path in the corresponding configuration (.yaml) file. We provide a sample pretrained model extractor in pretrained_models/, but you can also write your own script.

Similarly, to test the model and generate segmentation results on Cityscapes, do

python tools/seg_test.py --cfg experiments/[DATASET_NAME]/[CONFIG_FILE_NAME].yaml

You can (and probably should) initiate the Cityscapes training with an ImageNet-pretrained MDEQ. You need to extract the state dict from the ImageNet checkpointed model, and set the MODEL.PRETRAINED entry in Cityscapes yaml file to this state dict on disk.

The model implementation and MDEQ's algorithmic components (e.g., L-Broyden's method) can be found in lib/.

Pre-trained Models

We provide some reasonably good pre-trained weights here so that one can quickly play with DEQs without training from scratch.

Description Task Dataset Model
MDEQ-XL ImageNet Classification ImageNet download (.pkl)
MDEQ-XL Cityscapes(val) Segmentation Cityscapes download (.pkl)
MDEQ-Small ImageNet Classification ImageNet download (.pkl)
MDEQ-Small Cityscapes(val) Segmentation Cityscapes download (.pkl)

I. Example of how to evaluate the pretrained ImageNet model:

  1. Download the pretrained ImageNet .pkl file. (I recommend using the gdown command!)
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. Run the MDEQ classification validation command:
python tools/cls_valid.py --testModel pretrained_models/[FILENAME] --cfg experiments/imagenet/cls_mdeq_[SIZE].yaml

For example, for MDEQ-Small, you should get >75% top-1 accuracy.

II. Example of how to use the pretrained ImageNet model to train on Cityscapes:

  1. Download the pretrained ImageNet .pkl file.
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set MODEL.PRETRAINED to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation training command (see the "Usage" section above):
python -m torch.distributed.launch --nproc_per_node=[N_GPUS] tools/seg_train.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

III. Example of how to use the pretrained Cityscapes model for inference:

  1. Download the pretrained Cityscapes .pkl file
  2. Put the model under pretrained_models/ folder with some file name [FILENAME].
  3. In the corresponding experiments/cityscapes/seg_MDEQ_[SIZE].yaml (where SIZE is typically SMALL, LARGE or XL), set TEST.MODEL_FILE to "pretrained_models/[FILENAME]".
  4. Run the MDEQ segmentation testing command (see the "Usage" section above):
python tools/seg_test.py --cfg experiments/cityscapes/seg_MDEQ_[SIZE].yaml

Tips:

  • To load the Cityscapes pretrained model, download the .pkl file and specify the path in config.[TRAIN/TEST].MODEL_FILE (which is '' by default) in the .yaml files. This is different from setting MODEL.PRETRAINED, see the point below.
  • The difference between [TRAIN/TEST].MODEL_FILE and MODEL.PRETRAINED arguments in the yaml files: the former is used to load all of the model parameters; the latter is for compound training (e.g., when transferring from ImageNet to Cityscapes, we want to discard the final classifier FC layers).
  • The repo supports checkpointing of models at each epoch. One can resume from a previously saved checkpoint by turning on the TRAIN.RESUME argument in the yaml files.
  • Just like DEQs, the MDEQ models can be slower than explicit deep networks, and even more so as the image size increases (because larger images typically require more Broyden iterations to converge well; see Figure 5 in the paper). But one can play with the forward and backward thresholds to adjust the runtime.

Acknowledgement

Some utilization code (e.g., model summary and yaml processing) of this repo were modified from the HRNet repo and the DEQ repo.

Owner
CMU Locus Lab
Zico Kolter's Research Group
CMU Locus Lab
Super-Fast-Adversarial-Training - A PyTorch Implementation code for developing super fast adversarial training

Super-Fast-Adversarial-Training This is a PyTorch Implementation code for develo

LBK 26 Dec 02, 2022
Rule Based Classification Project For Python

Rule-Based-Classification-Project (ENG) Business Problem: A game company wants to create new level-based customer definitions (personas) by using some

Deniz Can OÄžUZ 4 Oct 29, 2022
Yolo algorithm for detection + centroid tracker to track vehicles

Vehicle Tracking using Centroid tracker Algorithm used : Yolo algorithm for detection + centroid tracker to track vehicles Backend : opencv and python

6 Dec 21, 2022
Meta Language-Specific Layers in Multilingual Language Models

Meta Language-Specific Layers in Multilingual Language Models This repo contains the source codes for our paper On Negative Interference in Multilingu

Zirui Wang 20 Feb 13, 2022
Author: Wenhao Yu ([email protected]). ACL 2022. Commonsense Reasoning on Knowledge Graph for Text Generation

Diversifying Commonsense Reasoning Generation on Knowledge Graph Introduction -- This is the pytorch implementation of our ACL 2022 paper "Diversifyin

DM2 Lab @ ND 61 Dec 30, 2022
Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR)

This is the official implementation of our paper Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR), which has been accepted by WSDM2022.

Yongchun Zhu 81 Dec 29, 2022
An implementation on "Curved-Voxel Clustering for Accurate Segmentation of 3D LiDAR Point Clouds with Real-Time Performance"

Lidar-Segementation An implementation on "Curved-Voxel Clustering for Accurate Segmentation of 3D LiDAR Point Clouds with Real-Time Performance" from

Wangxu1996 135 Jan 06, 2023
This is the workbook I created while I was studying for the Qiskit Associate Developer exam. I hope this becomes useful to others as it was for me :)

A Workbook for the Qiskit Developer Certification Exam Hello everyone! This is Bartu, a fellow Qiskitter. I have recently taken the Certification exam

Bartu Bisgin 66 Dec 10, 2022
Pytorch implementation of BRECQ, ICLR 2021

BRECQ Pytorch implementation of BRECQ, ICLR 2021 @inproceedings{ li&gong2021brecq, title={BRECQ: Pushing the Limit of Post-Training Quantization by Bl

Yuhang Li 148 Dec 28, 2022
null

DeformingThings4D dataset Video | Paper DeformingThings4D is an synthetic dataset containing 1,972 animation sequences spanning 31 categories of human

208 Jan 03, 2023
A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models.

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications, from tumor identification to wildlife monitoring to poverty mapping.

P-Lambda 437 Dec 30, 2022
Spline is a tool that is capable of running locally as well as part of well known pipelines like Jenkins (Jenkinsfile), Travis CI (.travis.yml) or similar ones.

Welcome to spline - the pipeline tool Important note: Since change in my job I didn't had the chance to continue on this project. My main new project

Thomas Lehmann 29 Aug 22, 2022
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
Joint Versus Independent Multiview Hashing for Cross-View Retrieval[J] (IEEE TCYB 2021, PyTorch Code)

Thanks to the low storage cost and high query speed, cross-view hashing (CVH) has been successfully used for similarity search in multimedia retrieval. However, most existing CVH methods use all view

4 Nov 19, 2022
The source code for Adaptive Kernel Graph Neural Network at AAAI2022

AKGNN The source code for Adaptive Kernel Graph Neural Network at AAAI2022. Please cite our paper if you think our work is helpful to you: @inproceedi

11 Nov 25, 2022
SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021)

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data (AAAI 2021) PyTorch implementation of SnapMix | paper Method Overview Cite

DavidHuang 126 Dec 30, 2022
Code base for "On-the-Fly Test-time Adaptation for Medical Image Segmentation"

On-the-Fly Adaptation Official Pytorch Code base for On-the-Fly Test-time Adaptation for Medical Image Segmentation Paper Introduction One major probl

Jeya Maria Jose 17 Nov 10, 2022
This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector of the financial market.

GPlearn_finiance_stock_futures_extension This implementation contains the application of GPlearn's symbolic transformer on a commodity futures sector

Chengwei <a href=[email protected]"> 189 Dec 25, 2022
A GPT, made only of MLPs, in Jax

MLP GPT - Jax (wip) A GPT, made only of MLPs, in Jax. The specific MLP to be used are gMLPs with the Spatial Gating Units. Working Pytorch implementat

Phil Wang 53 Sep 27, 2022