Code base for the paper "Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation"

Overview

This repository contains code for the paper Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation.

Installation

Our dependencies are fully specified in Pipfile, which can be supplied to pipenv to install the environment. One failsafe approach is to install pipenv in a fresh virtual environment, then run pipenv install in this directory. Note the Pipfile specifies our Python 3.9 development environment; most experiments were run in an identical environment under Python 3.7 instead.

Difficulties with CUDA versions meant we had to manually install PyTorch and Torchvision rather than use pipenv --- the corresponding lines in Pipfile may need adjustment for your use case. Alternatively, use the list of dependencies as a guide to what to install yourself with pip, or use the full dump of our development environment in final_requirements.txt.

Datasets may not be bundled with the repository, but are expected to be found at locations specified in datasets.py, preprocessed into single PyTorch tensors of all the input and output data (generally data/<dataset>/data.pt and data/<dataset>/targets.pt).

Configuration

Training code is controlled with YAML configuration files, as per the examples in configs/. Generally one file is required to specify the dataset, and a second to specify the algorithm, using the obvious naming convention. Brief help text is available on the command line, but the meanings of each option should be reasonably self-explanatory.

For Ours (WD+LR), use the file Ours_LR.yaml; for Ours (WD+LR+M), use the file Ours_LR_Momentum.yaml; for Ours (WD+HDLR+M), use the file Ours_HDLR_Momentum.yaml. For Long/Medium/Full Diff-through-Opt, we provide separate configuration files for the UCI cases and the Fashion-MNIST cases.

We provide two additional helper configurations. Random_Validation.yaml copies Random.yaml, but uses the entire validation set to compute the validation loss at each logging step. This allows for stricter analysis of the best-performing run at particular time steps, for instance while constructing Random (3-batched). Random_Validation_BayesOpt.yaml only forces the use of the entire dataset for the very last validation loss computation, so that Bayesian Optimisation runs can access reliable performance metrics without adversely affecting runtime.

The configurations provided match those necessary to replicate the main experiments in our paper (in Section 4: Experiments). Other trials, such as those in the Appendix, will require these configurations to be modified as we describe in the paper. Note especially that our three short-horizon bias studies all require different modifications to the LongDiffThroughOpt_*.yaml configurations.

Running

Individual runs are commenced by executing train.py and passing the desired configuration files with the -c flag. For example, to run the default Fashion-MNIST experiments using Diff-through-Opt, use:

$ python train.py -c ./configs/fashion_mnist.yaml ./configs/DiffThroughOpt.yaml

Bayesian Optimisation runs are started in a similar way, but with a call to bayesopt.py rather than train.py.

For executing multiple runs in parallel, parallel_exec.py may be useful: modify the main function call at the bottom of the file as required, then call this file instead of train.py at the command line. The number of parallel workers may be specified by num_workers. Any configurations passed at the command line are used as a base, to which modifications may be added by override_generator. The latter should either be a function which generates one override dictionary per call (in which case num_repetitions sets the number of overrides to generate), or a function which returns a generator over configurations (in which case set num_repetitions = None). Each configuration override is run once for each of algorithms, whose configurations are read automatically from the corresponding files and should not be explicitly passed at the command line. Finally, main_function may be used to switch between parallel calls to train.py and bayesopt.py as required.

For blank-slate replications, the most useful override generators will be natural_sgd_generator, which generates a full SGD initialisation in the ranges we use, and iteration_id, which should be used with Bayesian Optimisation runs to name each parallel run using a counter. Other generators may be useful if you wish to supplement existing results with additional algorithms etc.

PennTreebank and CIFAR-10 were executed on clusters running SLURM; the corresponding subfolders contain configuration scripts for these experiments, and submit.sh handles the actual job submission.

Analysis

By default, runs are logged in Tensorboard format to the ./runs directory, where Tensorboard may be used to inspect the results. If desired, a descriptive name can be appended to a particular execution using the -n switch on the command line. Runs can optionally be written to a dedicated subfolder specified with the -g switch, and the base folder for logging can be changed with the -l switch.

If more precise analysis is desired, pass the directory containing the desired results to util.get_tags(), which will return a dictionary of the evolution of each logged scalar in the results. Note that this function uses Tensorboard calls which predate its --load_fast option, so may take tens of minutes to return.

This data dictionary can be passed to one of the more involved plotting routines in figures.py to produce specific plots. The script paper_plots.py generates all the plots we use in our paper, and may be inspected for details of any particular plot.

PrimitiveNet: Primitive Instance Segmentation with Local Primitive Embedding under Adversarial Metric (ICCV 2021)

PrimitiveNet Source code for the paper: Jingwei Huang, Yanfeng Zhang, Mingwei Sun. [PrimitiveNet: Primitive Instance Segmentation with Local Primitive

Jingwei Huang 47 Dec 06, 2022
Benchmark datasets, data loaders, and evaluators for graph machine learning

Overview The Open Graph Benchmark (OGB) is a collection of benchmark datasets, data loaders, and evaluators for graph machine learning. Datasets cover

1.5k Jan 05, 2023
Py-FEAT: Python Facial Expression Analysis Toolbox

Py-FEAT is a suite for facial expressions (FEX) research written in Python. This package includes tools to detect faces, extract emotional facial expressions (e.g., happiness, sadness, anger), facial

Computational Social Affective Neuroscience Laboratory 147 Jan 06, 2023
Tutorial for the PERFECTING FACTORY 5.0 WITH EDGE-POWERED AI workshop

Workshop Advantech Jetson Nano This tutorial has been designed for the PERFECTING FACTORY 5.0 WITH EDGE-POWERED AI workshop in collaboration with Adva

Edge Impulse 18 Nov 22, 2022
PyTorch implementation of the paper Deep Networks from the Principle of Rate Reduction

Deep Networks from the Principle of Rate Reduction This repository is the official PyTorch implementation of the paper Deep Networks from the Principl

459 Dec 27, 2022
ReSSL: Relational Self-Supervised Learning with Weak Augmentation

ReSSL: Relational Self-Supervised Learning with Weak Augmentation This repository contains PyTorch evaluation code, training code and pretrained model

mingkai 45 Oct 25, 2022
Like Dirt-Samples, but cleaned up

Clean-Samples Like Dirt-Samples, but cleaned up, with clear provenance and license info (generally a permissive creative commons licence but check the

TidalCycles 39 Nov 30, 2022
CVPR2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View

Rethinking Semantic Segmentation: A Prototype View Rethinking Semantic Segmentation: A Prototype View, Tianfei Zhou, Wenguan Wang, Ender Konukoglu and

Tianfei Zhou 239 Dec 26, 2022
An off-line judger supporting distributed problem repositories

Thaw δΈ­ζ–‡ | English Thaw is an off-line judger supporting distributed problem repositories. Everyone can use Thaw release problems with license on GitHu

countercurrent_time 2 Jan 09, 2022
High-quality implementations of standard and SOTA methods on a variety of tasks.

Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo

Google 1.1k Dec 30, 2022
πŸƒβ€β™€οΈ A curated list about human motion capture, analysis and synthesis.

Awesome Human Motion πŸƒβ€β™€οΈ A curated list about human motion capture, analysis and synthesis. Contents Introduction Human Models Datasets Data Process

Dennis Wittchen 274 Dec 14, 2022
A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently develop and compare their own methods.

Knodle (Knowledge-supervised Deep Learning Framework) - a new framework for weak supervision with neural networks. It provides a modularization for se

93 Nov 06, 2022
OOD Generalization and Detection (ACL 2020)

Pretrained Transformers Improve Out-of-Distribution Robustness How does pretraining affect out-of-distribution robustness? We create an OOD benchmark

littleRound 57 Jan 09, 2023
Improving Non-autoregressive Generation with Mixup Training

MIST Training MIST TRAIN_FILE=/your/path/to/train.json VALID_FILE=/your/path/to/valid.json OUTPUT_DIR=/your/path/to/save_checkpoints CACHE_DIR=/your/p

7 Nov 22, 2022
[ICML 2020] Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Control

PG-MORL This repository contains the implementation for the paper Prediction-Guided Multi-Objective Reinforcement Learning for Continuous Robot Contro

MIT Graphics Group 65 Jan 07, 2023
Perspective: Julia for Biologists

Perspective: Julia for Biologists 1. Examples Speed: Example 1 - Single cell data and network inference Domain: Single cell data Methodology: Network

Elisabeth Roesch 55 Dec 02, 2022
Article Reranking by Memory-enhanced Key Sentence Matching for Detecting Previously Fact-checked Claims.

MTM This is the official repository of the paper: Article Reranking by Memory-enhanced Key Sentence Matching for Detecting Previously Fact-checked Cla

ICTMCG 13 Sep 17, 2022
Rule Based Classification Project For Python

Rule-Based-Classification-Project (ENG) Business Problem: A game company wants to create new level-based customer definitions (personas) by using some

Deniz Can OĞUZ 4 Oct 29, 2022
Neural-fractal - Create Fractals Using Complex-Valued Neural Networks!

Neural Fractal Create Fractals Using Complex-Valued Neural Networks! Home Page Features Define Dynamical Systems Using Complex-Valued Neural Networks

Amirabbas Asadi 10 Dec 17, 2022
Top #1 Submission code for the first https://alphamev.ai MEV competition with best AUC (0.9893) and MSE (0.0982).

alphamev-winning-submission Top #1 Submission code for the first alphamev MEV competition with best AUC (0.9893) and MSE (0.0982). The code won't run

70 Oct 29, 2022