ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure

Last update: Dec 08, 2022

Overview

[ 👷 🏗 👷 🏗 Coming soon! Official release with improved docs. Stay tuned. 👷 🏗 👷 🏗 ]

ViViT is a collection of numerical tricks to efficiently access curvature from the generalized Gauss-Newton (GGN) matrix based on its low-rank structure. Provided functionality includes computing

GGN eigenvalues
GGN eigenpairs (eigenvalues + eigenvectors)
1ˢᵗ- and 2ⁿᵈ-order directional derivatives along GGN eigenvectors
Newton steps

To reduce cost, these operations can additionally approximate the GGN via sub-sampling, Monte-Carlo approximation, and block-diagonal approximation.
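The low-rank structure mentioned above can be illustrated with a small NumPy sketch. It assumes the GGN factorizes as V Vᵀ, so its nonzero eigenvalues coincide with those of the much smaller Gram matrix Vᵀ V. The sizes `D`, `N`, `C` and the random `V` are stand-ins chosen for illustration. This is not ViViT's API, which obtains these quantities through BackPACK.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: D parameters, N samples, C model outputs per sample.
D, N, C = 50, 4, 3

# Random stand-in for the GGN factor V (D x N*C). In a real model its columns
# would come from backpropagation; random values are an assumption here.
V = rng.standard_normal((D, N * C))

ggn = V @ V.T   # D x D matrix of rank at most N*C
gram = V.T @ V  # N*C x N*C Gram matrix, small whenever N*C << D

# eigvalsh returns eigenvalues in ascending order, so the last N*C
# eigenvalues of the GGN are its nonzero ones.
ggn_evals = np.linalg.eigvalsh(ggn)[-N * C:]
gram_evals = np.linalg.eigvalsh(gram)

print(np.allclose(ggn_evals, gram_evals))
```

Working with the Gram matrix replaces a D x D eigenproblem by an N*C x N*C one, which is where the savings come from when the model has many parameters.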

How does it work? ViViT uses and extends BackPACK for PyTorch. The described functionality is realized through a combination of existing and new BackPACK extensions and hooks into its backpropagation.

Installation

👷 🏗 👷 🏗 The PyPI release is coming soon. 👷 🏗 👷 🏗

For now, you need to install from GitHub via

pip install vivit-for-pytorch@git+https://github.com/f-dangel/vivit.git#egg=vivit-for-pytorch

Examples

👷 🏗 👷 🏗 Coming soon! 👷 🏗 👷 🏗

How to cite

If you are using ViViT, consider citing the paper

@misc{dangel2022vivit,
      title={{ViViT}: Curvature access through the generalized Gauss-Newton's low-rank structure},
      author={Felix Dangel and Lukas Tatzel and Philipp Hennig},
      year={2022},
      eprint={2106.02624},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Comments

[ADD] Warn about instabilities if eigenvalues are small

The directional gradient computation and transformation of the Newton step from Gram space into parameter space require division by the square root of the direction's eigenvalue. This is unstable if the eigenvalue is close to zero.

opened by f-dangel 1
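A minimal sketch of such a warning, assuming positive eigenvalues given as a plain list. The function name `inv_sqrt_eigenvalue` and the cutoff `threshold` are hypothetical and not taken from the library.

```python
import math
import warnings


def inv_sqrt_eigenvalue(evals, threshold=1e-6):
    """Return 1 / sqrt(lam) for each positive eigenvalue lam.

    Emits a warning if any eigenvalue is below `threshold` (a hypothetical
    cutoff), because the scaling factor then becomes very large.
    """
    num_small = sum(1 for lam in evals if lam < threshold)
    if num_small > 0:
        warnings.warn(
            f"{num_small} eigenvalue(s) below {threshold}: "
            "division by their square root may be unstable."
        )
    return [1.0 / math.sqrt(lam) for lam in evals]


# Usage: the second eigenvalue is tiny and triggers the warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    scales = inv_sqrt_eigenvalue([4.0, 1e-12])

print(scales, len(caught))
```

The tiny eigenvalue is amplified by a factor of roughly a million, which is the instability described above.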
[ADD] Clean `DirectionalDampedNewtonComputation`
Adds directionally damped Newton step computation with a cleaned-up API.

Fixes a bug in the eigenvalue criterion in the tests. It always picked one more eigenvalue than specified.
opened by f-dangel 1
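As a rough illustration of a directionally damped Newton step, the NumPy sketch below forms the step from eigenpairs, the gradient, and one damping value per direction. It assumes the step is -Σₖ γₖ / (λₖ + δₖ) eₖ with γₖ = eₖᵀ g. The function `damped_newton_step` and all data are hypothetical and do not reflect the interface of `DirectionalDampedNewtonComputation`.

```python
import numpy as np


def damped_newton_step(evals, evecs, grad, damping):
    """Return -sum_k gamma_k / (lambda_k + delta_k) * e_k.

    evals:   (K,) eigenvalues lambda_k
    evecs:   (D, K) orthonormal eigenvectors e_k as columns
    grad:    (D,) gradient
    damping: (K,) one damping value delta_k per direction
    """
    gammas = evecs.T @ grad  # first-order directional derivatives
    coefficients = -gammas / (evals + damping)
    return evecs @ coefficients


rng = np.random.default_rng(0)
D, K = 20, 4  # hypothetical sizes
V = rng.standard_normal((D, K))  # random stand-in for the GGN factor
ggn = V @ V.T
grad = rng.standard_normal(D)

evals, evecs = np.linalg.eigh(ggn)
evals, evecs = evals[-K:], evecs[:, -K:]  # keep the K nonzero directions

delta = 0.1
step = damped_newton_step(evals, evecs, grad, np.full(K, delta))
```

With the same damping in every direction, this step matches the damped Newton step computed from the gradient's projection onto the selected eigenvectors. Choosing different `damping` entries damps each direction individually.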
[DOC] Add NTK example

Adds an example inspired by the functorch tutorial on NTKs. It demonstrates how to use vivit to compute empirical NTK matrices and makes a comparison with the functorch implementation.

opened by f-dangel 1
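For orientation, an empirical NTK matrix contracts the per-sample output Jacobians over the parameter dimension. The NumPy sketch below does this for a hypothetical linear model f(x) = W x, whose Jacobian is known in closed form. It uses neither vivit nor functorch, and the N x N x C x C layout is a choice made here.

```python
import numpy as np

rng = np.random.default_rng(0)
N, I, C = 4, 3, 2  # hypothetical: samples, input features, model outputs
X = rng.standard_normal((N, I))


def jacobian(x):
    """Jacobian of f(x) = W x w.r.t. the row-major flattened W, shape C x (C*I).

    For a linear model, the Jacobian does not depend on the value of W.
    """
    return np.kron(np.eye(C), x[None, :])


J = np.stack([jacobian(x) for x in X])  # N x C x (C*I)

# Empirical NTK: contract the Jacobians of samples n and m over parameters.
ntk = np.einsum("ncp,mdp->nmcd", J, J)  # N x N x C x C

print(ntk.shape)
```

For this linear model, each C x C block reduces to the inner product of the two inputs times the identity, which makes the result easy to check by hand.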
[ADD] Simplify `DirectionalDerivatives` API
Exotic features, like using different GGNs to compute directions and directional curvatures, as well as full control of which intermediate buffers to keep, have been deprecated in favor of a simpler API.

Remove Newton step computation for now as it was internally relying on DirectionalDerivatives

Remove many utilities and associated tests from the exotic features

Forbid duplicate indices in subsampling

Always delete intermediate buffers other than the target quantities
opened by f-dangel 1
[DOC] Set up `sphinx` and RTD

This PR adds a scaffold for the docs at https://vivit.readthedocs.io/en/latest/. Code examples are integrated via sphinx-gallery (I added a preliminary logo). Pull requests are built by the CI.

To build the docs, run make docs. You need to install the dependencies first, for example using pip install -e .[docs].

opened by f-dangel 1
Calculate Parameter Space Values of GGN Eigenvectors

The docs show how to calculate the Gram matrix eigenvectors, and the paper explains that to translate from 'Gram space' to parameter space we just need to multiply by the 'V' matrix.

What's the easiest way of implementing this?
question

opened by lk-wq 1
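One way to picture the transformation asked about here is the NumPy sketch below. It assumes the GGN is V Vᵀ and the Gram matrix is Vᵀ V, so that a Gram eigenvector ẽ with eigenvalue λ maps to the normalized parameter-space eigenvector V ẽ / √λ. The dense random `V` is a stand-in. How the library represents V internally is not covered by this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 30, 5  # hypothetical: D parameters, K-dimensional Gram space
V = rng.standard_normal((D, K))  # random stand-in for the 'V' matrix

# Eigen-decomposition in Gram space (K x K problem).
gram_evals, gram_evecs = np.linalg.eigh(V.T @ V)

# Map each Gram eigenvector to parameter space and normalize it:
# column k is V @ gram_evecs[:, k] / sqrt(gram_evals[k]).
param_evecs = (V @ gram_evecs) / np.sqrt(gram_evals)

ggn = V @ V.T
print(np.allclose(ggn @ param_evecs, param_evecs * gram_evals))
```

The division by the square root of the eigenvalue only normalizes the vectors, so it should be skipped or guarded for eigenvalues close to zero.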
Detect loss function's `reduction`, error if unsupported
For now, the library only supports reduction='mean'. We rely on the user to use this reduction and raise awareness about this point in the documentation. It would be better to automatically have the library detect the reduction and error if it is unsupported.

This can be done via a hook into BackPACK.

[ ] Implement hook that determines the loss function reduction during backpropagation

[ ] Integrate the above hook into the *Computation and raise an exception if the reduction is not supported

[ ] Remove the comments about supported reductions in the documentation

enhancement
opened by f-dangel 0
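A simpler alternative to the hook proposed above is to inspect the loss module before backpropagation. The sketch below assumes the loss object exposes a `reduction` string attribute, as PyTorch's built-in loss modules do. `check_reduction` and `FakeLoss` are hypothetical names, and this is not the BackPACK hook described in the issue.

```python
SUPPORTED_REDUCTIONS = ("mean",)


def check_reduction(loss_func):
    """Return the loss function's reduction, raising if it is unsupported."""
    reduction = getattr(loss_func, "reduction", None)
    if reduction not in SUPPORTED_REDUCTIONS:
        raise NotImplementedError(
            f"Unsupported reduction {reduction!r}. "
            f"Supported: {SUPPORTED_REDUCTIONS}."
        )
    return reduction


class FakeLoss:
    """Hypothetical stand-in for a loss module with a `reduction` attribute."""

    def __init__(self, reduction):
        self.reduction = reduction


accepted = check_reduction(FakeLoss("mean"))

try:
    check_reduction(FakeLoss("sum"))
    rejected = False
except NotImplementedError:
    rejected = True

print(accepted, rejected)
```

Unlike a backpropagation hook, this check only works when the loss module itself is available to the caller.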

Releases (1.0.0)

1.0.0 (Jun 22, 2022)

First public release. Details about future releases will be documented in the changelog.

Owner

Felix Dangel

Machine Learning PhD student at the University of Tübingen and the Max Planck Institute for Intelligent Systems.

Paper (arXiv): https://arxiv.org/abs/2106.02624
