Efficient Sharpness-aware Minimization for Improved Training of Neural Networks

Code for “Efficient Sharpness-aware Minimization for Improved Training of Neural Networks”

Requirements

This code is implemented in PyTorch, and we have tested the code under the following environment settings:

python = 3.8.8
torch = 1.8.0
torchvision = 0.9.0

What is in this repository

Code for ESAM on the CIFAR10/CIFAR100 datasets.

How to use it

import torch
from utils.layer_dp_sam import ESAM

# paras = [inputs, targets, loss_fct, model, defined_backward]; it is defined further down
base_optimizer = torch.optim.SGD(model.parameters(), lr=args.learning_rate, momentum=0.9, weight_decay=args.weight_decay)
optimizer = ESAM(paras, base_optimizer, rho=args.rho, weight_dropout=args.weight_dropout, adaptive=args.isASAM, nograd_cutoff=args.nograd_cutoff, opt_dropout=args.opt_dropout, temperature=args.temperature)

--beta: the SWP (Stochastic Weight Perturbation) hyperparameter

--gamma: the SDS (Sharpness-sensitive Data Selection) hyperparameter
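
As a rough illustration of what these two hyperparameters control, the sketch below follows the paper's description rather than this repository's code. It assumes that beta is the probability that a weight is included in the perturbation (with selected entries rescaled by 1/beta), and that gamma is the fraction of the batch kept by data selection, ranked by how much each sample's loss increases under the perturbation. The function names swp_mask and sds_select are hypothetical and do not exist in utils.layer_dp_sam.

```python
import random


def swp_mask(num_weights, beta, rng):
    """Hypothetical SWP mask (assumption, not the repository's code).

    Each weight is selected for perturbation with probability beta.
    Selected entries get scale 1/beta so the expected perturbation is
    unchanged; unselected entries get 0.
    """
    return [1.0 / beta if rng.random() < beta else 0.0 for _ in range(num_weights)]


def sds_select(loss_before, loss_after, gamma):
    """Hypothetical SDS selection (assumption, not the repository's code).

    Keeps the gamma fraction of samples whose instance-wise loss increased
    the most after the weight perturbation, and returns their indices.
    """
    increase = [after - before for before, after in zip(loss_before, loss_after)]
    keep = max(1, int(gamma * len(increase)))
    ranked = sorted(range(len(increase)), key=lambda i: increase[i], reverse=True)
    return sorted(ranked[:keep])


rng = random.Random(0)
mask = swp_mask(8, beta=0.5, rng=rng)
selected = sds_select([1.0, 0.5, 2.0, 0.1], [1.2, 1.5, 2.1, 0.1], gamma=0.5)
print(mask)      # entries are either 0.0 or 1/beta
print(selected)  # indices of the samples kept for the second pass
```

Under these assumptions, a smaller beta perturbs fewer weights and a smaller gamma back-propagates through fewer samples, which is where the savings over plain SAM would come from.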

During training, loss_fct must be created with reduction="none" so that it returns instance-wise losses. defined_backward is the function that runs the backward pass; use it to handle DDP and mixed-precision training.

loss_fct = torch.nn.CrossEntropyLoss(reduction="none")
def defined_backward():
    if args.fp16:
        # amp is assumed to be NVIDIA Apex's amp module
        with amp.scale_loss(loss, optimizer0) as scaled_loss:
            scaled_loss.backward()
    else:
        loss.backward()

paras = [inputs, targets, loss_fct, model, defined_backward]
optimizer.paras = paras
optimizer.step()
predictions_logits, loss = optimizer.returnthings
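
The optimizer needs one loss value per sample, which is why reduction="none" is required above. The standard-library sketch below imitates what a cross-entropy loss returns in each reduction mode for a small batch. The helper cross_entropy is hypothetical and written only for illustration; it is not PyTorch's implementation.

```python
import math


def cross_entropy(logits, targets, reduction="mean"):
    """Illustrative cross-entropy over a batch of logit rows.

    reduction="none" returns one loss per sample; any other value
    returns the batch mean as a single number.
    """
    losses = []
    for row, target in zip(logits, targets):
        # Numerically stable log-sum-exp of the row.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        losses.append(log_sum_exp - row[target])
    if reduction == "none":
        return losses
    return sum(losses) / len(losses)


logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
targets = [0, 2]
per_sample = cross_entropy(logits, targets, reduction="none")
mean_loss = cross_entropy(logits, targets)
print(per_sample)  # a list with one loss per sample
print(mean_loss)   # a single number, the average of the list above
```

With the default mean reduction the per-sample information is gone, so a method that ranks or filters individual samples has nothing to work with.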

Example

bash run.sh

Reference Code

[1] SAM

Owner

Angusdu
