A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

Last update: Aug 18, 2022

Overview

MADGRAD Optimization Algorithm For Tensorflow

This package implements the MadGrad Algorithm proposed in Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization (Aaron Defazio and Samy Jelassi, 2021).

Table of Contents

About The Project
Getting Started
- Prerequisites
- Installation
Usage
Contributing
License
Contact
Citations

About The Project

The MadGrad algorithm of optimization uses Dual averaging of gradients along with momentum based adaptivity to attain results that match or outperform Adam or SGD + momentum based algorithms. This project offers a Tensorflow implementation of the algorithm along with a few usage examples and tests.

Prerequisites

Prerequisites can be installed separately through the requirements.txt file as below

pip install -r requirements.txt

Installation

This project is built with Python 3 and can be pip installed directly

pip install tf-madgrad

Usage

To use the optimizer in any tf.keras model, you just need to import and instantiate the MadGrad optimizer from the tf_madgrad package.

from madgrad import MadGrad

# Create the architecture
inp = tf.keras.layers.Input(shape=shape)
...
op = tf.keras.layers.Dense(classes, activation=activation)

# Instantiate the model
model = tf.keras.models.Model(inp, op)

# Pass the MadGrad optimizer to the compile function
model.compile(optimizer=MadGrad(lr=0.01), loss=loss)

# Fit the keras model as normal
model.fit(...)

This implementation is also supported for distributed training using tf.strategy

See a MNIST example here

Contributing

Any and all contributions are welcome. Please raise an issue if the optimizer gives incorrect results or crashes unexpectedly during training.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Feel free to reach out for any issues or requests related to this implementation

Darshan Deshpande - Email | LinkedIn

Citations

@misc{defazio2021adaptivity,
      title={Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization}, 
      author={Aaron Defazio and Samy Jelassi},
      year={2021},
      eprint={2101.11075},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

A tf.keras implementation of Facebook AI's MadGrad optimization algorithm

Related tags

Overview

MADGRAD Optimization Algorithm For Tensorflow

About The Project

Prerequisites

Installation

Usage

Contributing

License

Contact

Citations

Owner

This repository is related to an Arabic tutorial, within the tutorial we discuss the common data structure and algorithms and their worst and best case for each, then implement the code using Python.

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

[ICCV2021] Safety-aware Motion Prediction with Unseen Vehicles for Autonomous Driving

Wenzhou-Kean University AI-LAB

Python scripts using the Mediapipe models for Halloween.

NudeNet: Neural Nets for Nudity Classification, Detection and selective censoring

🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

Code for "Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation". [AAAI 2021]

Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

Official Pytorch implementation of "Unbiased Classification Through Bias-Contrastive and Bias-Balanced Learning (NeurIPS 2021)

Neon-erc20-example - Example of creating SPL token and wrapping it with ERC20 interface in Neon EVM

Video Matting Refinement For Python

The BCNet related data and inference model.

[PNAS2021] The neural architecture of language: Integrative modeling converges on predictive processing

Semi-Supervised Graph Prototypical Networks for Hyperspectral Image Classification, IGARSS, 2021.

Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models

State-of-the-art data augmentation search algorithms in PyTorch

A PyTorch implementation of EventProp [https://arxiv.org/abs/2009.08378], a method to train Spiking Neural Networks

Volumetric parameterization of the placenta to a flattened template

Solution of Kaggle competition: Sartorius - Cell Instance Segmentation