Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561

Overview

Meta-Solver for Neural Ordinary Differential Equations

Towards robust neural ODEs using parametrized solvers.

Main idea

Each Runge-Kutta (RK) solver with s stages and of the p-th order is defined by a table of coefficients (Butcher tableau). For s=p=2, s=p=3 and s=p=4 all coefficient in the table can be parametrized with no more than two variables [1].

Usually, during neural ODE training RK solver with fixed Butcher tableau is used, and only the right-hand side (RHS) function is trained. We propose to use the whole parametric family of RK solvers to improve robustness of neural ODEs.

Requirements

  • pytorch==1.7
  • apex==0.1 (for training)

Examples

For CIFAR-10 and MNIST demo, please, check examples folder.

Meta Solver Regimes

In the notebook examples/cifar10/Evaluate model.ipynb we show how to perform the forward pass through the Neural ODE using different types of Meta Solver regimes, namely

  • Standalone
  • Solver switching/smoothing
  • Solver ensembling
  • Model ensembling

In more details, usage of different regimes means

  • Standalone

    • Use one solver during inference.
    • This regime is applied in the training and testing stages.
  • Solver switching / smoothing

    • For each batch one solver is chosen from a group of solvers with finite (in switching regime) or infinite (in smoothing regime) number of candidates.
    • This regime is applied in the training stage
  • Solver ensembling

    • Use several solvers durung inference.
    • Outputs of ODE Block (obtained with different solvers) are averaged before propagating through the next layer.
    • This regime is applied in the training and testing stages.
  • Model ensembling

    • Use several solvers durung inference.
    • Model probabilites obtained via propagation with different solvers are averaged to get the final result.
    • This regime is applied in the training and testing stages.

Selected results

Different solver parameterizations yield different robustness

We have trained a neural ODE model several times, using different u values in parametrization of the 2-nd order Runge-Kutta solver. The image below depicts robust accuracies for the MNIST classification task. We use PGD attack (eps=0.3, lr=2/255 and iters=7). The mean values of robust accuracy (bold lines) and +- standard error mean (shaded region) computed across 9 random seeds are shown in this image.

Solver smoothing improves robustness

We compare results of neural ODE adversarial training on CIFAR-10 dataset with and without solver smoothing (using normal distribution with mean = 0 and sigma=0.0125). We choose 8-steps RK2 solver with u=0.5 for this experiment.

  • We perform training using FGSM random technique described in https://arxiv.org/abs/2001.03994 (with eps=8/255, alpha=10/255).
  • We use cyclic learning rate schedule with one cycle (36 epochs, max_lr=0.1, base_lr=1e-7).
  • We measure robust accuracy of resulting models after FGSM (eps=8/255) and PGD (eps=8/255, lr=2/255, iters=7) attacks.
  • We use premetanode10 architecture from sopa/src/models/odenet_cifar10/layers.py that has the following form Conv -> PreResNet block -> ODE block -> PreResNet block -> ODE block -> GeLU -> Average Pooling -> Fully Connected
  • We compute mean and standard error across 3 random seeds.

References

[1] Wanner, G., & Hairer, E. (1993). Solving ordinary differential equations I. Springer Berlin Heidelberg

Owner
Julia Gusak
Julia Gusak
PyTorch implementation of D2C: Diffuison-Decoding Models for Few-shot Conditional Generation.

D2C: Diffuison-Decoding Models for Few-shot Conditional Generation Project | Paper PyTorch implementation of D2C: Diffuison-Decoding Models for Few-sh

Jiaming Song 90 Dec 27, 2022
Jax/Flax implementation of Variational-DiffWave.

jax-variational-diffwave Jax/Flax implementation of Variational-DiffWave. (Zhifeng Kong et al., 2020, Diederik P. Kingma et al., 2021.) DiffWave with

YoungJoong Kim 37 Dec 16, 2022
A Python library for working with arbitrary-dimension hypercomplex numbers following the Cayley-Dickson construction of algebras.

Hypercomplex A Python library for working with quaternions, octonions, sedenions, and beyond following the Cayley-Dickson construction of hypercomplex

7 Nov 04, 2022
Blender scripts for computing geodesic distance

GeoDoodle Geodesic distance computation for Blender meshes Table of Contents Overivew Usage Implementation Overview This addon provides an operator fo

20 Jun 08, 2022
Improved Fitness Optimization Landscapes for Sequence Design

ReLSO Improved Fitness Optimization Landscapes for Sequence Design Description Citation How to run Training models Original data source Description In

Krishnaswamy Lab 44 Dec 20, 2022
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

Adelaide Intelligent Machines (AIM) Group 7 Sep 12, 2022
PyQt6 configuration in yaml format providing the most simple script.

PyamlQt(ぴゃむるきゅーと) PyQt6 configuration in yaml format providing the most simple script. Requirements yaml PyQt6, ( PyQt5 ) Installation pip install Pya

Ar-Ray 7 Aug 15, 2022
Pytorch implementation of Learning Rate Dropout.

Learning-Rate-Dropout Pytorch implementation of Learning Rate Dropout. Paper Link: https://arxiv.org/pdf/1912.00144.pdf Train ResNet-34 for Cifar10: r

42 Nov 25, 2022
A program that can analyze videos according to the weights you select

MaskMonitor A program that can analyze videos according to the weights you select 下載 訓練完的 weight檔案 執行 MaskDetection.py 內部可更改 輸入來源(鏡頭, 影片, 圖片) 以及輸出條件(人

Patrick_star 1 Nov 07, 2021
TCube generates rich and fluent narratives that describes the characteristics, trends, and anomalies of any time-series data (domain-agnostic) using the transfer learning capabilities of PLMs.

TCube: Domain-Agnostic Neural Time series Narration This repository contains the code for the paper: "TCube: Domain-Agnostic Neural Time series Narrat

Mandar Sharma 7 Oct 31, 2021
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
A way to store images in YAML.

YAMLImg A way to store images in YAML. I made this after seeing Roadcrosser's JSON-G because it was too inspiring to ignore this opportunity. Installa

5 Mar 14, 2022
Implementation of the "PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences" paper.

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences Introduction Point cloud sequences are irregular and unordered in the spatial dimen

Hehe Fan 63 Dec 09, 2022
Code for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling Using BERT Adapter"

Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter Code and checkpoints for the ACL2021 paper "Lexicon Enhanced Chinese Sequence Labelling

274 Dec 06, 2022
CondenseNet: Light weighted CNN for mobile devices

CondenseNets This repository contains the code (in PyTorch) for "CondenseNet: An Efficient DenseNet using Learned Group Convolutions" paper by Gao Hua

Shichen Liu 690 Nov 30, 2022
The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.

This repository is the official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness. Requirements pip install -r requi

Jie Ren 17 Dec 12, 2022
Cascading Feature Extraction for Fast Point Cloud Registration (BMVC 2021)

Cascading Feature Extraction for Fast Point Cloud Registration This repository contains the source code for the paper [Arxive link comming soon]. Meth

7 May 26, 2022
JstDoS - HTTP Protocol Stack Remote Code Execution Vulnerability

jstDoS If you are going to skid that, please give credits ! ^^ ¿How works? This

apolo 4 Feb 11, 2022
The easiest tool for extracting radiomics features and training ML models on them.

Simple pipeline for experimenting with radiomics features Installation git clone https://github.com/piotrekwoznicki/ClassyRadiomics.git cd classrad pi

Piotr Woźnicki 17 Aug 04, 2022
End-to-end Temporal Action Detection with Transformer. [Under review]

TadTR: End-to-end Temporal Action Detection with Transformer By Xiaolong Liu, Qimeng Wang, Yao Hu, Xu Tang, Song Bai, Xiang Bai. This repo holds the c

Xiaolong Liu 105 Dec 25, 2022