DeepOBS: A Deep Learning Optimizer Benchmark Suite

Last update: May 12, 2020

Overview

DeepOBS - A Deep Learning Optimizer Benchmark Suite

DeepOBS is a benchmarking suite that drastically simplifies, automates and improves the evaluation of deep learning optimizers.

It can evaluate the performance of new optimizers on a variety of real-world test problems and automatically compare them with realistic baselines.

DeepOBS automates several steps when benchmarking deep learning optimizers:

Downloading and preparing data sets.
Setting up test problems consisting of contemporary data sets and realistic deep learning architectures.
Running the optimizers on multiple test problems and logging relevant metrics.
Reporting and visualizing the results of the optimizer benchmark.

This branch contains the beta of version 1.2.0 with TensorFlow and PyTorch support. It is currently in a pre-release state. Not all features are implemented and most notably we currently don't provide baselines for this version.

The full documentation of this beta version is available on readthedocs: https://deepobs-with-pytorch.readthedocs.io/

The paper describing DeepOBS has been accepted for ICLR 2019 and can be found here: https://openreview.net/forum?id=rJg6ssC5Y7

If you find any bugs in DeepOBS, or find it hard to use, please let us know. We are always interested in feedback and ways to improve DeepOBS.

Installation

pip install -e git+https://github.com/fsschneider/[email protected]#egg=DeepOBS

We tested the package with Python 3.6, TensorFlow version 1.12, Torch version 1.1.0 and Torchvision version 0.3.0. Other versions might work, and we plan to expand compatibility in the future.

Further tutorials and a suggested protocol for benchmarking deep learning optimizers can be found on https://deepobs-with-pytorch.readthedocs.io/

Comments

Request: Share the hyper-parameters found in the grid search

To lessen the burden of re-running the benchmark, would it be possible to publish the optimal hyper-parameters somewhere?

By reusing those hyper-parameters, one would avoid the most computationally demanding part of reproducing the results (by 1-2 orders of magnitude).

opened by jotaf98 2

Add functionality to skip existing runs, plotting modes, some refactoring

Adding parameter skip_if_exists to runner.run

Default value is set such that the current behavior is maintained

By setting to True, runs that already have a .json output file will not be executed again

Possible extensions

Make skip_if_exists arg-parsable
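
The skip behavior described above can be sketched with the standard library alone. This is a minimal illustration and not the code from the pull request: run_once, output_dir, run_name and train_fn are hypothetical names, and only the skip_if_exists flag and the .json output convention are taken from the description.

```python
import json
import tempfile
from pathlib import Path


def run_once(output_dir, run_name, train_fn, skip_if_exists=False):
    """Run train_fn and store its result as JSON, optionally skipping finished runs.

    Hypothetical helper: only the skip_if_exists flag mirrors the description.
    Returns True if the run was executed and False if it was skipped.
    """
    output_file = Path(output_dir) / f"{run_name}.json"
    if skip_if_exists and output_file.exists():
        return False  # an earlier run already produced this output file
    result = train_fn()
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with open(output_file, "w") as f:
        json.dump(result, f, indent=4)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fake_training = lambda: {"train_losses": [1.0, 0.5]}
        # The first call executes the run, the second one finds the file and skips.
        print(run_once(tmp, "sgd_seed42", fake_training, skip_if_exists=True))
        print(run_once(tmp, "sgd_seed42", fake_training, skip_if_exists=True))
```

Keeping the default at False preserves the existing behavior, as the description requires: a run is always executed unless the caller opts in.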
opened by f-dangel 2

KeyError: 'optimizer_hyperparams'

(Apologies for creating multiple issues in a row -- it seemed more clean to keep them separate.)

I downloaded the data from DeepOBS_Baselines, and attempted to run example_analyze_pytorch.py. Unfortunately DeepOBS seems to look for keys in the JSON files that don't exist:

$ python example_analyze_pytorch.py
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:144: RuntimeWarning: Metric valid_accuracies does not exist for testproblem quadratic_deep. We now use fallback metric valid_losses
  default_metric), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:229: RuntimeWarning: All settings for /scratch/local/ssd/user/data/deepobs/quadratic_deep/SGD on test problem quadratic_deep have the same number of seeds runs. Mode 'most' does not make sense and we use the fallback mode 'final'
  .format(optimizer_path, testproblem_name), RuntimeWarning)
{'Performance': 127.96759578159877, 'Speed': 'N.A.', 'Hyperparameters': {'lr': 0.01, 'momentum': 0.99, 'nesterov': False}, 'Training Parameters': {}}
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:144: RuntimeWarning: Metric valid_accuracies does not exist for testproblem quadratic_deep. We now use fallback metric valid_losses
  default_metric), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:229: RuntimeWarning: All settings for /scratch/local/ssd/user/data/deepobs/quadratic_deep/SGD on test problem quadratic_deep have the same number of seeds runs. Mode 'most' does not make sense and we use the fallback mode 'final'
  .format(optimizer_path, testproblem_name), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:150: RuntimeWarning: Cannot fallback to metric valid_losses for optimizer MomentumOptimizer on testproblem quadratic_deep. Will now fallback to metric test_losses
  testproblem_name), RuntimeWarning)
/users/user/miniconda3/lib/python3.7/site-packages/numpy/core/_methods.py:193: RuntimeWarning: invalid value encountered in subtract
  x = asanyarray(arr - arrmean)
/users/user/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py:3949: RuntimeWarning: invalid value encountered in multiply
  x2 = take(ap, indices_above, axis=axis) * weights_above
Traceback (most recent call last):
  File "example_analyze_pytorch.py", line 17, in <module>
    analyzer.plot_optimizer_performance(result_path, reference_path=base + '/deepobs/baselines/quadratic_deep/MomentumOptimizer')
  File "/users/user/Research/deepobs/deepobs/analyzer/analyze.py", line 514, in plot_optimizer_performance
    which=which)
  File "/users/user/Research/deepobs/deepobs/analyzer/analyze.py", line 462, in _plot_optimizer_performance
    optimizer_path, mode, metric)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 206, in create_setting_analyzer_ranking
    setting_analyzers = _get_all_setting_analyzer(optimizer_path)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 184, in _get_all_setting_analyzer
    setting_analyzers.append(SettingAnalyzer(sett_path))
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 260, in __init__
    self.aggregate = aggregate_runs(path)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 101, in aggregate_runs
    aggregate['optimizer_hyperparams'] = json_data['optimizer_hyperparams']
KeyError: 'optimizer_hyperparams'

One of the JSON files in question looks like this (data points snipped for brevity):

{
"train_losses": [353.9337594168527, 347.5994306291853, 331.35902622767856, 307.2468915666853, ... 97.28871154785156, 91.45470428466797, 96.45774841308594, 86.27237701416016],
"optimizer": "MomentumOptimizer",
"testproblem": "quadratic_deep",
"weight_decay": null,
"batch_size": 128,
"num_epochs": 100,
"learning_rate": 1e-05,
"lr_sched_epochs": null,
"lr_sched_factors": null,
"random_seed": 42,
"train_log_interval": 1,
"hyperparams": {"momentum": 0.99, "use_nesterov": false}
}

The obvious key seems to be hyperparams as opposed to optimizer_hyperparams; this occurs only for some JSON files.

Edit: Having fixed this, there is a further key error on training_params. Perhaps these were generated with different versions of the package.
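
A possible workaround is to normalize the older files before they are analyzed. The sketch below is an assumption based only on the two key errors reported here: normalize_run is a hypothetical helper, and the mapping from hyperparams to optimizer_hyperparams, as well as the empty default for training_params, is a guess rather than a documented migration path.

```python
import copy


def normalize_run(json_data):
    """Return a copy of a run dictionary with the keys the analyzer asks for.

    Assumption: older files store the optimizer settings under 'hyperparams'
    and have no 'training_params' entry. Other keys are left untouched.
    """
    data = copy.deepcopy(json_data)
    if "optimizer_hyperparams" not in data:
        # Move the legacy entry; fall back to an empty dict if it is missing too.
        data["optimizer_hyperparams"] = data.pop("hyperparams", {})
    data.setdefault("training_params", {})
    return data


# Shortened version of the JSON file shown above.
legacy = {
    "optimizer": "MomentumOptimizer",
    "testproblem": "quadratic_deep",
    "learning_rate": 1e-05,
    "hyperparams": {"momentum": 0.99, "use_nesterov": False},
}
print(normalize_run(legacy)["optimizer_hyperparams"])
```

The sketch does not rename entries inside the dictionary (for example use_nesterov), because the issue does not say which names the newer format expects.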

opened by jotaf98 3

Installation error / unmentioned dependency "bayes_opt"

Attempting to install by following the documentation's instructions, after installing all the mentioned dependencies with conda, results in the following error:

(base) [email protected]:~$ pip install -e git+https://github.com/abahde/DeepOBS.git@master#egg=DeepOBS
Obtaining DeepOBS from git+https://github.com/abahde/DeepOBS.git@master#egg=DeepOBS
  Cloning https://github.com/abahde/DeepOBS.git (to revision master) to ./src/deepobs
  Running command git clone -q https://github.com/abahde/DeepOBS.git /users/user/src/deepobs
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/users/user/src/deepobs/setup.py", line 5, in <module>
        from deepobs import __version__
      File "/users/user/src/deepobs/deepobs/__init__.py", line 5, in <module>
        from . import analyzer
      File "/users/user/src/deepobs/deepobs/analyzer/__init__.py", line 2, in <module>
        from . import analyze
      File "/users/user/src/deepobs/deepobs/analyzer/analyze.py", line 12, in <module>
        from ..tuner.tuner_utils import generate_tuning_summary
      File "/users/user/src/deepobs/deepobs/tuner/__init__.py", line 4, in <module>
        from .bayesian import GP
      File "/users/user/src/deepobs/deepobs/tuner/bayesian.py", line 3, in <module>
        from bayes_opt import UtilityFunction
    ModuleNotFoundError: No module named 'bayes_opt'
    ----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /users/user/src/deepobs/

Is this bayes_opt package really necessary? It seems a bit tangential to the package's purpose (or at most optional).

Edit: It turns out that bayesian-optimization has relatively few requirements so this is not a big issue; perhaps just the docs need updating.

As an aside, it might be possible to suggest a single conda command that installs everything: conda install -c conda-forge seaborn matplotlib2tikz bayesian-optimization.
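
If the dependency were to become optional, one common pattern is to import it only when the Bayesian tuner is actually used. The following is a sketch of that pattern, not the package's code: has_module and get_utility_function are hypothetical names, and the only detail taken from the traceback is the import from bayes_opt import UtilityFunction.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


def get_utility_function():
    """Import bayes_opt lazily so the rest of the package loads without it."""
    if not has_module("bayes_opt"):
        raise ImportError(
            "Bayesian tuning needs the optional 'bayesian-optimization' package."
        )
    from bayes_opt import UtilityFunction  # imported only when actually needed
    return UtilityFunction


if __name__ == "__main__":
    print("bayes_opt available:", has_module("bayes_opt"))
```

With this layout, importing the analyzer would no longer fail at install time; the error would appear only when a user requests the Bayesian tuner without having the package.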

opened by jotaf98 0

Wall-clock time plots

Optimizers can have very different runtimes per iteration, especially 2nd-order ones.

This means that sometimes, despite promises of "faster" convergence, the wall-clock time taken to converge is disappointingly longer.

Is there any chance DeepOBS could implement wall-clock time plots, in addition to per-epoch ones? (E.g. X axis in minutes or hours.)
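
To illustrate the request, cumulative wall-clock time can be logged next to the per-epoch losses and then used as the X axis. This is a minimal sketch under stated assumptions: train_with_timing, the train_one_epoch callback and the wall_clock_seconds key are hypothetical and not part of DeepOBS.

```python
import time


def train_with_timing(train_one_epoch, num_epochs):
    """Run epochs and record cumulative wall-clock seconds after each one.

    train_one_epoch is a hypothetical callback that returns the epoch's loss.
    """
    losses, elapsed = [], []
    start = time.perf_counter()
    for epoch in range(num_epochs):
        losses.append(train_one_epoch(epoch))
        elapsed.append(time.perf_counter() - start)
    return {"train_losses": losses, "wall_clock_seconds": elapsed}


log = train_with_timing(lambda epoch: 1.0 / (epoch + 1), num_epochs=3)
# Plotting log["train_losses"] against log["wall_clock_seconds"] instead of
# the epoch index gives a time-based X axis.
print(log["wall_clock_seconds"])
```

Measured times depend on the hardware and on other load on the machine, so such plots are only comparable between runs made under the same conditions.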

opened by jotaf98 4

Improve estimate_runtime()

There are a couple of improvements that I suggest:

[ ] Return the results not as a string, but as a dict or an object.

[ ] (Maybe, think about that) Include the ability to test multiple optimizers simultaneously.

[ ] Report standard deviation and individual runtimes for SGD.

[ ] Add a function that generates a figure, similar to https://github.com/ludwigbald/probprec/blob/master/code/exp_perf_prec/analyze.py
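
The first and third suggestions could look roughly like this. The sketch is illustrative and does not reproduce the signature of estimate_runtime(): time_runs, relative_overhead and the dictionary keys are hypothetical names.

```python
import statistics
import time


def time_runs(run_fn, num_runs=3):
    """Time repeated calls of run_fn and return the statistics as a dict.

    Hypothetical helper: returns individual runtimes, their mean and their
    sample standard deviation instead of a formatted string.
    """
    runtimes = []
    for _ in range(num_runs):
        start = time.perf_counter()
        run_fn()
        runtimes.append(time.perf_counter() - start)
    return {
        "runtimes": runtimes,
        "mean": statistics.mean(runtimes),
        "std": statistics.stdev(runtimes) if len(runtimes) > 1 else 0.0,
    }


def relative_overhead(optimizer_stats, sgd_stats):
    """Ratio of mean runtimes; values above 1 mean slower than SGD."""
    return optimizer_stats["mean"] / sgd_stats["mean"]


if __name__ == "__main__":
    sgd = time_runs(lambda: sum(range(10_000)))
    new_opt = time_runs(lambda: sum(range(30_000)))
    print(relative_overhead(new_opt, sgd))
```

Returning a dictionary lets callers compare several optimizers or build a figure from the raw runtimes without parsing text.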
opened by ludwigbald 0

Implement validation set split also for TensorFlow

In PyTorch we split the validation set from the training set randomly. It has the size of the test set. The validation performance is used by the tuner and analyzer to obtain the best instance. This split should be implemented in the TensorFlow data sets as well. We have already prepared the test problem and the runner implementations for this change. The only change that needs to be done to the runner is marked in the code with a ToDo flag.
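
The split itself can be sketched independently of either framework. The example below works on plain index lists and is only an illustration of the idea described above: split_train_valid and its arguments are hypothetical, and the actual PyTorch implementation in DeepOBS may differ.

```python
import random


def split_train_valid(num_train, num_test, seed=42):
    """Randomly split training indices; the validation part has num_test items.

    Illustrative only: operates on index lists instead of framework data sets.
    """
    if num_test >= num_train:
        raise ValueError("validation split would leave no training data")
    indices = list(range(num_train))
    random.Random(seed).shuffle(indices)  # local generator, global state untouched
    valid_indices = indices[:num_test]
    train_indices = indices[num_test:]
    return train_indices, valid_indices


train_idx, valid_idx = split_train_valid(num_train=60000, num_test=10000)
print(len(train_idx), len(valid_idx))  # prints: 50000 10000
```

Fixing the seed keeps the split identical across runs, so the tuner and the analyzer would compare all optimizers on the same validation examples.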
bug enhancement

opened by abahde 0

Releases (v1.2.0-beta)

v1.2.0-beta (Sep 17, 2019)

Draft of release notes:

A PyTorch implementation (though not for all test problems yet)

A refactored Analyzer module (more flexibility and interpretability)

A Tuning module that automates the tuning process

Some minor improvements of the TensorFlow code (important bugfix: fmnist_mlp now really uses F-MNIST and not MNIST)

The PyTorch code now provides a validation set metric for each test problem. So far, however, the TensorFlow code comes without validation sets.

Runners now break from training if the loss becomes NaN.

Runners now return the output dictionary.

Additional training parameters can be passed as kwargs to the run() method.

Numpy is now also seeded.

Small and large benchmark sets are now global variables in DeepOBS.

Default test problem settings are now a global variable in DeepOBS.

JSON output is now dumped in human readable format.

Accuracy is now only printed if available.

Simplified Runner API.

Learning Rate Schedule Runner is now an extra class.
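
Several of the runner changes listed above (stopping on a NaN loss, returning the output dictionary, seeding, and human-readable JSON) can be pictured with a toy loop. This is a sketch under stated assumptions and not the DeepOBS runner: run_training and the loss_per_epoch callback are hypothetical, and only the standard library generator is seeded here.

```python
import json
import math
import random


def run_training(loss_per_epoch, num_epochs, random_seed=42):
    """Toy training loop illustrating a few behaviours from the release notes.

    loss_per_epoch is a hypothetical callback returning the loss for an epoch.
    """
    random.seed(random_seed)  # a real runner would also seed NumPy and the framework
    output = {"random_seed": random_seed, "train_losses": []}
    for epoch in range(num_epochs):
        loss = loss_per_epoch(epoch)
        output["train_losses"].append(loss)
        if math.isnan(loss):
            break  # stop early instead of continuing with a diverged model
    return output  # handed back to the caller, not only written to disk


result = run_training(
    lambda e: float("nan") if e == 2 else 1.0 / (e + 1), num_epochs=10
)
print(json.dumps(result, indent=4))  # indent produces human-readable JSON
```

In this example the loop ends after the third epoch, because the third loss value is NaN.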


Owner

Aaron Bahde

Graduate student at the University of Tübingen, Methods of Machine Learning
