sam.pytorch

A PyTorch implementation of Sharpness-Aware Minimization for Efficiently Improving Generalization (Foret+2020). Paper, Official implementation.

Requirements

  • Python>=3.8
  • PyTorch>=1.7.1

To run the example, you further need

  • homura by pip install -U homura-core==2020.12.0
  • chika by pip install -U chika

Example

python cifar10.py [--optim.name {sam,sgd}] [--model {resnet20, wrn28_2}] [--optim.rho 0.05]
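For instance, the following illustrative invocation (using only the options shown above) trains WRN28-2 with SAM:

python cifar10.py --optim.name sam --model wrn28_2 --optim.rho 0.05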

Results: Test Accuracy (CIFAR-10)

Model       SAM    SGD
ResNet-20   93.5   93.2
WRN28-2     95.8   95.4
ResNeXt-29  96.4   95.8

SAM needs two forward-backward passes per update, so training with SAM is slower than training with SGD: for ResNet-20, roughly 80 minutes versus 50 minutes in my environment. The additional options --use_amp and --jit_model may slightly accelerate training.
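The two passes correspond to first ascending to an approximate worst-case point within a rho-ball around the current weights, then taking a plain SGD step with the gradient computed there. The following is a minimal, self-contained sketch of one such step, written only to illustrate the algorithm (it is not the code in sam.py; the actual SAMSGD optimizer may differ in parameter handling and numerical details). Here the closure only computes the loss and calls backward(); gradient zeroing is handled inside the step.

import torch

def sam_sgd_step(params, closure, lr=0.1, rho=0.05):
    # One sharpness-aware step; `closure` is evaluated twice.
    params = [p for p in params if p.requires_grad]

    # First pass: gradient at the current weights w.
    for p in params:
        p.grad = None
    loss = closure()
    grad_norm = torch.norm(
        torch.stack([p.grad.detach().norm(2) for p in params]), 2)

    # Ascend to w + eps, where eps = rho * grad / ||grad||.
    eps = []
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
            p.grad = None

    # Second pass: gradient at the perturbed weights w + eps.
    closure()

    # Restore w and apply an SGD update with the sharpness-aware gradient.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
            p.add_(p.grad, alpha=-lr)
            p.grad = None
    return loss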

Usage

SAMSGD can be used as a drop-in replacement for PyTorch optimizers, except that its step method requires a closure. It is also compatible with lr_scheduler and supports state_dict and load_state_dict (see the sketch after the example below).

from sam import SAMSGD

# rho controls the radius of the neighborhood used for the sharpness-aware
# perturbation; lr is the usual SGD learning rate.
optimizer = SAMSGD(model.parameters(), lr=1e-1, rho=0.05)

for input, target in dataset:
    def closure():
        # Called twice per step: once at the current weights and once at
        # the perturbed weights.
        optimizer.zero_grad()
        output = model(input)
        loss = loss_f(output, target)
        loss.backward()
        return loss

    loss = optimizer.step(closure)
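Since the interface otherwise matches torch.optim.SGD, standard scheduler and checkpointing code should work unchanged. A sketch continuing the example above (assuming a cosine schedule; the schedule choice is illustrative):

import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

scheduler = CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):
    for input, target in dataset:
        # define closure over (input, target) as in the example above
        loss = optimizer.step(closure)
    scheduler.step()

# state_dict / load_state_dict behave like any other PyTorch optimizer
torch.save(optimizer.state_dict(), "sam_sgd.pt")
optimizer.load_state_dict(torch.load("sam_sgd.pt"))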

Citation

@ARTICLE{2020arXiv201001412F,
    author = {{Foret}, Pierre and {Kleiner}, Ariel and {Mobahi}, Hossein and {Neyshabur}, Behnam},
    title = "{Sharpness-Aware Minimization for Efficiently Improving Generalization}",
    year = 2020,
    eid = {arXiv:2010.01412},
    eprint = {2010.01412},
}

@software{sampytorch,
    author = {Ryuichiro Hataya},
    title  = {sam.pytorch},
    url    = {https://github.com/moskomule/sam.pytorch},
    year   = {2020}
}