A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization

Last update: Dec 28, 2022

Overview

sam.pytorch

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization (Foret+2020). Paper, Official implementation.

Requirements

Python>=3.8
PyTorch>=1.7.1

To run the example, you additionally need

homura, installed by pip install -U homura-core==2020.12.0
chika, installed by pip install -U chika

Example

python cifar10.py [--optim.name {sam,sgd}] [--model {resnet20,wrn28_2}] [--optim.rho 0.05]

Results: Test Accuracy (CIFAR-10)

Model	SAM	SGD
ResNet-20	93.5	93.2
WRN28-2	95.8	95.4
ResNeXT29	96.4	95.8

SAM needs two forward-backward passes per update, so training with SAM is slower than training with SGD. For ResNet-20, training took 80 min with SAM versus 50 min with SGD in my environment. The additional options --use_amp and --jit_model may slightly accelerate training.
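To make the cost of the two passes concrete, here is a minimal framework-free sketch of one SAM update on a toy quadratic loss. It is an illustration only, not the code in this repository: the names loss_and_grad and sam_step, the toy loss, and the lr value are assumptions chosen for the example. Only rho=0.05 mirrors the default shown above.

```python
import math

grad_evals = 0  # counts gradient evaluations to show the 2x cost


def loss_and_grad(w):
    """Toy loss L(w) = 0.5 * (w0^2 + 10 * w1^2) and its gradient (hypothetical)."""
    global grad_evals
    grad_evals += 1
    loss = 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)
    grad = [w[0], 10.0 * w[1]]
    return loss, grad


def sam_step(w, lr=0.05, rho=0.05):
    # First pass: gradient at the current weights.
    _, g = loss_and_grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    # Step to the approximate worst-case point inside an L2 ball of radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Second pass: gradient at the perturbed weights.
    _, g_adv = loss_and_grad(w_adv)
    # Update the ORIGINAL weights using the gradient from the perturbed point.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w)

evals_during_training = grad_evals  # two evaluations per step
final_loss, _ = loss_and_grad(w)
```

Each call to sam_step evaluates the gradient twice, which is where the extra training time relative to plain SGD comes from.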

Usage

SAMSGD can be used as a drop-in replacement for PyTorch optimizers that take closures. It is also compatible with lr_scheduler and provides state_dict and load_state_dict.

from sam import SAMSGD

optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)

for input, target in dataset:
    def closure():
        # Re-evaluated by the optimizer for each of SAM's two passes.
        optimizer.zero_grad()
        output = model(input)
        loss = loss_f(output, target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)
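The scheduler and checkpoint compatibility mentioned above might be exercised as in the following sketch. It assumes SAMSGD behaves like a standard torch.optim.Optimizer, as the text states. The helper make_optimizer, the tiny linear model, and the random data are hypothetical, and the sketch falls back to torch.optim.SGD as a stand-in when the sam module is not importable.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import StepLR

try:
    from sam import SAMSGD  # available when run from this repository

    def make_optimizer(params):
        return SAMSGD(params, lr=1e-1, rho=0.05)
except ImportError:
    # Stand-in (assumption) so the sketch also runs outside the repository.
    def make_optimizer(params):
        return torch.optim.SGD(params, lr=1e-1)

torch.manual_seed(0)
model = nn.Linear(4, 2)            # hypothetical toy model
loss_f = nn.CrossEntropyLoss()
inputs = torch.randn(8, 4)         # hypothetical random batch
targets = torch.randint(0, 2, (8,))

optimizer = make_optimizer(model.parameters())
scheduler = StepLR(optimizer, step_size=1, gamma=0.5)  # halve lr every epoch

for epoch in range(2):
    def closure():
        optimizer.zero_grad()
        loss = loss_f(model(inputs), targets)
        loss.backward()
        return loss

    optimizer.step(closure)
    scheduler.step()

# Save the optimizer state and restore it into a fresh optimizer.
checkpoint = optimizer.state_dict()
restored = make_optimizer(model.parameters())
restored.load_state_dict(checkpoint)
```

After loading, the restored optimizer carries the scheduled learning rate from the checkpoint rather than the initial one.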

Citation

@ARTICLE{2020arXiv201001412F,
    author = {{Foret}, Pierre and {Kleiner}, Ariel and {Mobahi}, Hossein and {Neyshabur}, Behnam},
    title = "{Sharpness-Aware Minimization for Efficiently Improving Generalization}",
    year = 2020,
    eid = {arXiv:2010.01412},
    eprint = {2010.01412},
}

@software{sampytorch,
    author = {Ryuichiro Hataya},
    title  = {sam.pytorch},
    url    = {https://github.com/moskomule/sam.pytorch},
    year   = {2020}
}

Owner

Ryuichiro Hataya
