Code for "On Memorization in Probabilistic Deep Generative Models"

Overview

This repository contains the code needed to reproduce the experiments in "On Memorization in Probabilistic Deep Generative Models". You can also use this code to measure memorization in other types of probabilistic deep generative models. If you use our code in your own work, please cite the paper using, for instance, the following BibTeX entry:

@article{van2021memorization,
  title={On Memorization in Probabilistic Deep Generative Models},
  author={{Van den Burg}, G. J. J. and Williams, C. K. I.},
  journal={arXiv preprint arXiv:2106.03216},
  year={2021}
}

If you have any questions or encounter an issue when using this code, please send an email to gertjanvandenburg at gmail dot com.

Introduction

The files in the scripts directory are needed to reproduce the experiments and generate the figures in the paper. The experiments are organized using the Makefile provided. To reproduce the experiments or recreate the figures from the analysis, you'll have to install a number of dependencies. We use PyTorch to implement the deep learning algorithms. If you don't wish to re-run all the models, you can download the result files used in the paper (see below).

The scripts are all written in Python, and the necessary external dependencies can be found in the requirements.txt file. These can be installed using:

$ pip install -r requirements.txt

To recreate the figures, the following system dependencies are also needed: pdflatex, latexmk, lualatex, and make. These programs are available for all major platforms.

Reproducing the results

To train the models on the different data sets, you can run:

$ make memorization

Depending on your machine, this may take a long time, so it may be easier to download the result files instead. Although we set the random seed in PyTorch to aid reproducibility, differences in platform or package versions may still lead to slightly different output files (see also the PyTorch documentation on reproducibility).
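For readers adapting the code to their own models, the kind of seeding mentioned above can be sketched as follows. This is a minimal illustration, not the code used in this repository: the helper name `seed_everything` and the seed value are assumptions, and the import guard exists only so the snippet also runs where PyTorch is not installed.

```python
import random

try:
    import torch
except ImportError:  # PyTorch is optional for this illustration
    torch = None


def seed_everything(seed: int) -> None:
    """Seed the random number generators used here (hypothetical helper)."""
    random.seed(seed)
    if torch is not None:
        # Seeds the PyTorch generators on all devices.
        torch.manual_seed(seed)
        # Prefer deterministic cuDNN kernels over auto-tuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


seed_everything(123)  # 123 is an arbitrary example value
```

Even with these settings, results can differ across hardware and library versions, which is why the provided result files are the reference for the figures.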

All figures in the paper are generated from the raw result files using Python scripts. First, the summarize.py script takes the raw result files and creates summary files for each data set. Next, the analysis scripts generate the figures, most of which are LaTeX files that must be compiled with pdfLaTeX or LuaLaTeX. Simply run:

$ make analysis

to create the summaries and the output files. When using the result files described below, this reproduces exactly the figures shown in the paper.

Result files

Due to their size, the raw result files are not contained in this repository, but can be downloaded separately (about 2.6GB) from the URL used in the commands below. After downloading the results.zip file, unpack it and move the results directory into your clone of this repository, so that it sits next to the scripts directory. Below is a concise overview of the necessary commands:

$ git clone https://github.com/alan-turing-institute/memorization
$ cd memorization
$ wget https://gertjanvandenburg.com/projects/memorization/results.zip # or download the file in some other way
$ unzip results.zip
$ touch results/*/*/*          # update modification time of the result files
$ make analysis                # optionally, run ``make -n analysis`` first to see what will happen

After unpacking the zip file, you can optionally verify the integrity of the results using the SHA-256 checksums provided:

$ sha256sum --check results.sha256
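On platforms where sha256sum is not available, the same check can be approximated in Python. The sketch below assumes that results.sha256 uses the usual sha256sum layout (one hex digest followed by a file path per line) and that the paths are relative to the directory you pass in; the function names are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_checksums(manifest: Path, root: Path) -> list[str]:
    """Return manifest entries whose file is missing or has a different digest."""
    failures = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary mode with a leading "*"
        target = root / name
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(name)
    return failures
```

Calling `failed_checksums(Path("results.sha256"), Path("."))` from the repository root should return an empty list when all files are intact, provided the assumptions about the manifest layout hold.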

License

The code in this repository is licensed under the MIT license. See the LICENSE file for further details. You are free to reuse the code in this repository, but please cite our paper when you do.

Notes

If you find any problems or have a suggestion for improvement of this repository, please let me know as it will help make this resource better for everyone.
