Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Code for “Efficient Sharpness-aware Minimization for Improved Training of Neural Networks”

Requisite

This code is implemented in PyTorch, and we have tested the code under the following environment settings:

python = 3.8.8
torch = 1.8.0
torchvision = 0.9.0

What is in this repository

Code for ESAM on the CIFAR10/CIFAR100 datasets.

How to use it

import torch
from utils.layer_dp_sam import ESAM

base_optimizer = torch.optim.SGD(model.parameters(), lr=args.learning_rate, momentum=0.9, weight_decay=args.weight_decay)
optimizer = ESAM(paras, base_optimizer, rho=args.rho, weight_dropout=args.weight_dropout, adaptive=args.isASAM,
                 nograd_cutoff=args.nograd_cutoff, opt_dropout=args.opt_dropout, temperature=args.temperature)

--beta: the SWP (Stochastic Weight Perturbation) hyperparameter

--gamma: the SDS (Sharpness-sensitive Data Selection) hyperparameter
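To give a feel for what the two hyperparameters control, here is a minimal plain-Python sketch. It is not the code in this repository. It assumes that beta is the probability with which each parameter is perturbed (with the perturbation rescaled by 1/beta), and that gamma is the fraction of the batch kept, chosen by largest loss increase after the perturbation. The helper names swp_mask and sds_select are hypothetical.

```python
import random

def swp_mask(num_params, beta, rng):
    """Per-parameter scale factors: 1/beta with probability beta, otherwise 0.

    Assumption: a parameter with factor 0 is left unperturbed.
    """
    return [1.0 / beta if rng.random() < beta else 0.0 for _ in range(num_params)]

def sds_select(loss_before, loss_after, gamma):
    """Indices of the gamma fraction of samples whose loss increased the most."""
    keep = max(1, round(gamma * len(loss_before)))
    increase = [after - before for before, after in zip(loss_before, loss_after)]
    ranked = sorted(range(len(increase)), key=lambda i: increase[i], reverse=True)
    return sorted(ranked[:keep])

rng = random.Random(0)  # fixed seed so the mask is reproducible
mask = swp_mask(num_params=10, beta=0.5, rng=rng)

# Instance-wise losses before and after the weight perturbation (made-up numbers).
selected = sds_select([0.2, 1.0, 0.5, 0.3], [0.9, 1.1, 0.6, 1.5], gamma=0.5)
print(mask)
print(selected)
```

Under these assumptions, a smaller beta perturbs fewer weights and a smaller gamma keeps fewer samples for the second backward pass, which is where the compute saving would come from.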

During training, loss_fct should be created with reduction="none" so that it returns instance-wise losses. defined_backward is the function that performs the backward pass; it is where DDP and mixed-precision handling go.
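The difference between instance-wise and averaged losses can be illustrated without PyTorch. The sketch below is a plain-Python stand-in (the function name cross_entropy_per_sample is hypothetical) that computes one cross-entropy value per sample, which is the shape of result that reduction="none" gives, and then averages them, which is what the default reduction="mean" would return instead.

```python
import math

def cross_entropy_per_sample(logits, targets):
    """Return one cross-entropy loss per (logits row, target index) pair."""
    losses = []
    for row, target in zip(logits, targets):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in row))
        losses.append(log_sum_exp - row[target])
    return losses

logits = [[2.0, 0.5, 0.1], [0.2, 0.2, 0.2], [0.0, 3.0, 1.0]]
targets = [0, 2, 1]

per_sample = cross_entropy_per_sample(logits, targets)  # like reduction="none"
mean_loss = sum(per_sample) / len(per_sample)           # like reduction="mean"
print(per_sample)
print(mean_loss)
```

With only the averaged value, there would be no way to rank individual samples by their loss, which is presumably why the optimizer asks for the unreduced form.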

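loss_fct = torch.nn.CrossEntropyLoss(reduction="none")

def defined_backward():
    if args.fp16:
        # Mixed precision: scale the loss before backpropagating.
        # amp is assumed to be NVIDIA Apex here (from apex import amp).
        with amp.scale_loss(loss, optimizer0) as scaled_loss:
            scaled_loss.backward()
    else:
        loss.backward()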

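# Hand the current batch and the helpers to the optimizer, then take one step.
paras = [inputs, targets, loss_fct, model, defined_backward]
optimizer.paras = paras
optimizer.step()
# The optimizer exposes the logits and the loss it computed during the step.
predictions_logits, loss = optimizer.returnthings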

Example

bash run.sh

Reference Code

[1] SAM


Owner

Angusdu

scalingscattering
