DeepOBS: A Deep Learning Optimizer Benchmark Suite

Last update: May 12, 2020

Overview

DeepOBS is a benchmarking suite that drastically simplifies, automates and improves the evaluation of deep learning optimizers.

It can evaluate the performance of new optimizers on a variety of real-world test problems and automatically compare them with realistic baselines.

DeepOBS automates several steps when benchmarking deep learning optimizers:

Downloading and preparing data sets.
Setting up test problems consisting of contemporary data sets and realistic deep learning architectures.
Running the optimizers on multiple test problems and logging relevant metrics.
Reporting and visualizing the results of the optimizer benchmark.

This branch contains the beta of version 1.2.0 with TensorFlow and PyTorch support. It is currently in a pre-release state. Not all features are implemented; most notably, we do not yet provide baselines for this version.

The full documentation of this beta version is available on readthedocs: https://deepobs-with-pytorch.readthedocs.io/

The paper describing DeepOBS has been accepted for ICLR 2019 and can be found here: https://openreview.net/forum?id=rJg6ssC5Y7

If you find any bugs in DeepOBS, or find it hard to use, please let us know. We are always interested in feedback and ways to improve DeepOBS.

Installation

pip install -e git+https://github.com/fsschneider/[email protected]#egg=DeepOBS

We tested the package with Python 3.6, TensorFlow version 1.12, Torch version 1.1.0 and Torchvision version 0.3.0. Other versions might work, and we plan to expand compatibility in the future.

Further tutorials and a suggested protocol for benchmarking deep learning optimizers can be found on https://deepobs-with-pytorch.readthedocs.io/

Comments

Request: Share the hyper-parameters found in the grid search

To lessen the burden of re-running the benchmark, would it be possible to publish the optimal hyper-parameters somewhere?

By reusing those hyper-parameters, one would avoid the most computationally demanding part of reproducing the results (a saving of 1-2 orders of magnitude).

opened by jotaf98 2

Add functionality to skip existing runs, plotting modes, some refactoring

Adding parameter skip_if_exists to runner.run

Default value is set such that the current behavior is maintained

By setting to True, runs that already have a .json output file will not be executed again
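The skip behavior described above can be sketched as follows. This is a minimal illustration, not the code from the pull request: the helper names `has_existing_output` and `run_or_skip`, the `run_dir` argument, and the `train_fn` callable are hypothetical, and the only detail taken from the description is that a run counts as finished once a .json output file exists.

```python
from pathlib import Path


def has_existing_output(run_dir):
    """Return True if run_dir already contains at least one .json file."""
    run_dir = Path(run_dir)
    return run_dir.is_dir() and any(run_dir.glob("*.json"))


def run_or_skip(run_dir, train_fn, skip_if_exists=False):
    """Call train_fn unless skipping is enabled and output already exists.

    The default skip_if_exists=False keeps the previous behavior:
    the run is always executed.
    """
    if skip_if_exists and has_existing_output(run_dir):
        return None  # hypothetical convention: None signals a skipped run
    return train_fn()
```

With the default value the function always trains, which mirrors the statement that the current behavior is maintained unless the flag is set.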

Possible extensions

Make skip_if_exists arg-parsable

opened by f-dangel 2

KeyError: 'optimizer_hyperparams'

(Apologies for creating multiple issues in a row -- it seemed more clean to keep them separate.)

I downloaded the data from DeepOBS_Baselines, and attempted to run example_analyze_pytorch.py. Unfortunately DeepOBS seems to look for keys in the JSON files that don't exist:

$ python example_analyze_pytorch.py
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:144: RuntimeWarning: Metric valid_accuracies does not exist for testproblem quadratic_deep. We now use fallback metric valid_losses
  default_metric), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:229: RuntimeWarning: All settings for /scratch/local/ssd/user/data/deepobs/quadratic_deep/SGD on test problem quadratic_deep have the same number of seeds runs. Mode 'most' does not make sense and we use the fallback mode 'final'
  .format(optimizer_path, testproblem_name), RuntimeWarning)
{'Performance': 127.96759578159877, 'Speed': 'N.A.', 'Hyperparameters': {'lr': 0.01, 'momentum': 0.99, 'nesterov': False}, 'Training Parameters': {}}
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:144: RuntimeWarning: Metric valid_accuracies does not exist for testproblem quadratic_deep. We now use fallback metric valid_losses
  default_metric), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:229: RuntimeWarning: All settings for /scratch/local/ssd/user/data/deepobs/quadratic_deep/SGD on test problem quadratic_deep have the same number of seeds runs. Mode 'most' does not make sense and we use the fallback mode 'final'
  .format(optimizer_path, testproblem_name), RuntimeWarning)
/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py:150: RuntimeWarning: Cannot fallback to metric valid_losses for optimizer MomentumOptimizer on testproblem quadratic_deep. Will now fallback to metric test_losses
  testproblem_name), RuntimeWarning)
/users/user/miniconda3/lib/python3.7/site-packages/numpy/core/_methods.py:193: RuntimeWarning: invalid value encountered in subtract
  x = asanyarray(arr - arrmean)
/users/user/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py:3949: RuntimeWarning: invalid value encountered in multiply
  x2 = take(ap, indices_above, axis=axis) * weights_above
Traceback (most recent call last):
  File "example_analyze_pytorch.py", line 17, in <module>
    analyzer.plot_optimizer_performance(result_path, reference_path=base + '/deepobs/baselines/quadratic_deep/MomentumOptimizer')
  File "/users/user/Research/deepobs/deepobs/analyzer/analyze.py", line 514, in plot_optimizer_performance
    which=which)
  File "/users/user/Research/deepobs/deepobs/analyzer/analyze.py", line 462, in _plot_optimizer_performance
    optimizer_path, mode, metric)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 206, in create_setting_analyzer_ranking
    setting_analyzers = _get_all_setting_analyzer(optimizer_path)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 184, in _get_all_setting_analyzer
    setting_analyzers.append(SettingAnalyzer(sett_path))
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 260, in __init__
    self.aggregate = aggregate_runs(path)
  File "/users/user/Research/deepobs/deepobs/analyzer/shared_utils.py", line 101, in aggregate_runs
    aggregate['optimizer_hyperparams'] = json_data['optimizer_hyperparams']
KeyError: 'optimizer_hyperparams'

One of the JSON files in question looks like this (data points snipped for brevity):

{
"train_losses": [353.9337594168527, 347.5994306291853, 331.35902622767856, 307.2468915666853, ... 97.28871154785156, 91.45470428466797, 96.45774841308594, 86.27237701416016],
"optimizer": "MomentumOptimizer",
"testproblem": "quadratic_deep",
"weight_decay": null,
"batch_size": 128,
"num_epochs": 100,
"learning_rate": 1e-05,
"lr_sched_epochs": null,
"lr_sched_factors": null,
"random_seed": 42,
"train_log_interval": 1,
"hyperparams": {"momentum": 0.99, "use_nesterov": false}
}

The obvious key seems to be hyperparams as opposed to optimizer_hyperparams; this occurs only for some JSON files.

Edit: Having fixed this, there is a further key error on training_params. Perhaps these were generated with different versions of the package.
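A possible workaround, sketched under assumptions: the helper below renames the older hyperparams key to optimizer_hyperparams and supplies an empty training_params entry when it is missing. The function names are hypothetical and not part of DeepOBS, and an empty dictionary as the default for training_params is an assumption suggested only by the 'Training Parameters': {} output shown above. Other differences between file versions (for example the top-level learning_rate field) are left untouched.

```python
import json
from pathlib import Path


def normalize_run_dict(run):
    """Return a copy of a run dictionary using the newer key names."""
    fixed = dict(run)
    # Older files appear to store optimizer settings under "hyperparams".
    if "optimizer_hyperparams" not in fixed and "hyperparams" in fixed:
        fixed["optimizer_hyperparams"] = fixed.pop("hyperparams")
    # Assumption: an empty dict is an acceptable default when the key is absent.
    fixed.setdefault("training_params", {})
    return fixed


def normalize_json_file(path):
    """Rewrite one JSON result file in place with normalized keys."""
    path = Path(path)
    with path.open() as f:
        run = json.load(f)
    with path.open("w") as f:
        json.dump(normalize_run_dict(run), f, indent=4)
```

Running `normalize_json_file` on a copy of the baseline data first is advisable, since it overwrites the file.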

opened by jotaf98 3

Installation error / unmentioned dependency "bayes_opt"

Attempting to install by following the documentation's instructions, after installing all the mentioned dependencies with conda, results in the following error:

(base) [email protected]:~$ pip install -e git+https://github.com/abahde/DeepOBS.git@master#egg=DeepOBS
Obtaining DeepOBS from git+https://github.com/abahde/DeepOBS.git@master#egg=DeepOBS
  Cloning https://github.com/abahde/DeepOBS.git (to revision master) to ./src/deepobs
  Running command git clone -q https://github.com/abahde/DeepOBS.git /users/user/src/deepobs
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/users/user/src/deepobs/setup.py", line 5, in <module>
        from deepobs import __version__
      File "/users/user/src/deepobs/deepobs/__init__.py", line 5, in <module>
        from . import analyzer
      File "/users/user/src/deepobs/deepobs/analyzer/__init__.py", line 2, in <module>
        from . import analyze
      File "/users/user/src/deepobs/deepobs/analyzer/analyze.py", line 12, in <module>
        from ..tuner.tuner_utils import generate_tuning_summary
      File "/users/user/src/deepobs/deepobs/tuner/__init__.py", line 4, in <module>
        from .bayesian import GP
      File "/users/user/src/deepobs/deepobs/tuner/bayesian.py", line 3, in <module>
        from bayes_opt import UtilityFunction
    ModuleNotFoundError: No module named 'bayes_opt'
    ----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /users/user/src/deepobs/

Is this bayes_opt package really necessary? It seems a bit tangential to the package's purpose (or at most optional).

Edit: It turns out that bayesian-optimization has relatively few requirements so this is not a big issue; perhaps just the docs need updating.

As an aside, it might be possible to suggest a single conda command that installs everything: conda install -c conda-forge seaborn matplotlib2tikz bayesian-optimization.
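One common way to make a dependency such as bayes_opt optional is to attempt the import and fall back gracefully. The sketch below shows the general pattern only; `optional_import` is a hypothetical helper and does not reflect how DeepOBS structures its tuner module.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# The package would then only require bayes_opt when the Bayesian tuner is used.
bayes_opt = optional_import("bayes_opt")
BAYES_OPT_AVAILABLE = bayes_opt is not None
```

Code that needs the optional package can check `BAYES_OPT_AVAILABLE` and raise an informative error at call time instead of failing at import time.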

opened by jotaf98 0

Wall-clock time plots

Optimizers can have very different runtimes per iteration, especially 2nd-order ones.

This means that sometimes, despite promises of "faster" convergence, the wall-clock time taken to converge is disappointingly larger.

Is there any chance DeepOBS could implement wall-clock time plots, in addition to per-epoch ones? (E.g. X axis in minutes or hours.)
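To illustrate what such a plot would need, here is a minimal sketch that records cumulative wall-clock time next to the per-epoch loss. The function `train_with_wallclock` and the `train_one_epoch` callable are hypothetical names; DeepOBS runners may log time differently or not at all.

```python
import time


def train_with_wallclock(num_epochs, train_one_epoch):
    """Run epochs and record cumulative elapsed minutes after each one.

    train_one_epoch is any callable that takes the epoch index and
    returns that epoch's loss.
    """
    elapsed_minutes, losses = [], []
    start = time.perf_counter()
    for epoch in range(num_epochs):
        losses.append(train_one_epoch(epoch))
        elapsed_minutes.append((time.perf_counter() - start) / 60.0)
    return elapsed_minutes, losses
```

Plotting `losses` against `elapsed_minutes` instead of the epoch index gives the wall-clock view requested here.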

opened by jotaf98 4

Improve estimate_runtime()

There are a couple of improvements that I suggest:

[ ] Return the results not as a string, but as a dict or an object.

[ ] (Maybe, think about that) Include the ability to test multiple optimizers simultaneously.

[ ] Report standard deviation and individual runtimes for SGD.

[ ] Add a function that generates a figure, similar to https://github.com/ludwigbald/probprec/blob/master/code/exp_perf_prec/analyze.py
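The first and third suggestions could look roughly like the sketch below, which returns a dictionary with the individual runtimes, their mean, and their standard deviation. The names `time_runs` and `relative_overhead` are hypothetical; this is not the signature of the existing estimate_runtime() function.

```python
import statistics
import time


def time_runs(run_fn, n_runs=5):
    """Time run_fn n_runs times and return structured results."""
    runtimes = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_fn()
        runtimes.append(time.perf_counter() - start)
    return {
        "runtimes": runtimes,
        "mean": statistics.mean(runtimes),
        # Population standard deviation, so a single run does not raise.
        "std": statistics.pstdev(runtimes),
    }


def relative_overhead(candidate, baseline):
    """Ratio of mean runtimes, e.g. a new optimizer relative to SGD."""
    return candidate["mean"] / baseline["mean"]
```

Returning a dictionary rather than a formatted string lets callers compute ratios or build figures from the raw numbers.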

opened by ludwigbald 0

Implement validation set split also for TensorFlow

In PyTorch we split the validation set from the training set randomly. It has the size of the test set. The validation performance is used by the tuner and analyzer to obtain the best instance. This split should be implemented in the TensorFlow data sets as well. We have already prepared the test problem and the runner implementations for this change. The only change that needs to be done to the runner is marked in the code with a ToDo flag.
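The split described above can be sketched with index lists in plain Python: a random subset of the training indices, equal in size to the test set, is held out for validation. The function name, the seed argument, and the use of the standard library instead of framework data-set utilities are assumptions made for illustration.

```python
import random


def split_train_valid(num_train, num_test, seed=42):
    """Split training indices into train and validation index lists.

    The validation set gets num_test randomly chosen indices, matching
    the size of the test set; the rest remain for training.
    """
    if num_test > num_train:
        raise ValueError("validation set cannot exceed the training set")
    indices = list(range(num_train))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    valid_indices = sorted(indices[:num_test])
    train_indices = sorted(indices[num_test:])
    return train_indices, valid_indices
```

The same index lists could then be used to select examples in either framework's data pipeline.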

Labels: bug, enhancement

opened by abahde 0

Releases (v1.2.0-beta)

v1.2.0-beta (Sep 17, 2019)

Draft of release notes:

A PyTorch implementation (though not for all test problems yet)

A refactored Analyzer module (more flexibility and interpretability)

A Tuning module that automates the tuning process

Some minor improvements of the TensorFlow code (important bugfix: fmnist_mlp now really uses F-MNIST and not MNIST)

The PyTorch code now provides a validation set metric for each test problem. So far, however, the TensorFlow code comes without validation sets.

Runners now break from training if the loss becomes NaN.

Runners now return the output dictionary.

Additional training parameters can be passed as kwargs to the run() method.

Numpy is now also seeded.

Small and large benchmark sets are now global variables in DeepOBS.

Default test problem settings are now a global variable in DeepOBS.

JSON output is now dumped in human readable format.

Accuracy is now only printed if available.

Simplified Runner API.

Learning Rate Schedule Runner is now an extra class.


Owner

Aaron Bahde

Graduate student at the University of Tübingen, Methods of Machine Learning
