PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

Overview

Score-Based Generative Modeling through Stochastic Differential Equations

This repo contains a PyTorch implementation of the paper Score-Based Generative Modeling through Stochastic Differential Equations by Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole.


We propose a unified framework that generalizes and improves previous work on score-based generative models through the lens of stochastic differential equations (SDEs). In particular, we can transform data to a simple noise distribution with a continuous-time stochastic process described by an SDE. This SDE can be reversed for sample generation if we know the score of the marginal distributions at each intermediate time step, which can be estimated with score matching. The basic idea is captured in the figure below:

[Figure: schematic]
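
To make the forward/reverse picture concrete, here is a minimal one-dimensional sketch. It is not code from this repository. It assumes Gaussian data, so that the score of every marginal is known in closed form, and a zero-drift SDE dx = g dw with constant g; the names `marginal_var`, `score`, `reverse_sample` and all constants are illustrative. In the real setting the analytic score would be replaced by a neural network trained with score matching.

```python
import math
import random

# Toy setup (all values are illustrative assumptions, not repo defaults).
MU, S0 = 2.0, 0.5   # data distribution: N(MU, S0^2)
G = 1.0             # constant diffusion coefficient; the drift f is zero
T = 1.0             # final time of the forward SDE


def marginal_var(t):
    """Variance of p_t when dx = G dw is started from N(MU, S0^2)."""
    return S0 ** 2 + G ** 2 * t


def score(x, t):
    """Exact score d/dx log p_t(x) of the Gaussian marginal at time t."""
    return -(x - MU) / marginal_var(t)


def reverse_sample(n_samples=1000, n_steps=500, seed=0):
    """Draw samples by integrating the reverse-time SDE with Euler-Maruyama."""
    rng = random.Random(seed)
    dt = T / n_steps
    samples = []
    for _ in range(n_samples):
        # Start from the terminal marginal p_T (the "noise" distribution).
        x = rng.gauss(MU, math.sqrt(marginal_var(T)))
        for i in range(n_steps):
            t = T - i * dt
            # Reverse SDE: dx = [f - g^2 * score] dt + g dw_bar with f = 0.
            # Stepping backward in time flips the sign of the dt term.
            noise = rng.gauss(0.0, 1.0)
            x = x + G ** 2 * score(x, t) * dt + G * math.sqrt(dt) * noise
        samples.append(x)
    return samples


samples = reverse_sample()
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(f"recovered mean={mean:.2f}, std={std:.2f} (data: mean={MU}, std={S0})")
```

With the exact score, the recovered mean and standard deviation should land close to those of the data distribution, up to discretization and Monte Carlo error.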

Our work enables a better understanding of existing approaches, new sampling algorithms, exact likelihood computation, uniquely identifiable encoding, latent code manipulation, and brings new conditional generation abilities (including but not limited to class-conditional generation, inpainting and colorization) to the family of score-based generative models.

All combined, we achieved an FID of 2.20 and an Inception score of 9.89 for unconditional generation on CIFAR-10, as well as high-fidelity generation of 1024px CelebA-HQ images (samples below). In addition, we obtained a likelihood value of 2.99 bits/dim on uniformly dequantized CIFAR-10 images.

[Figure: FFHQ samples]

What does this code do?

Aside from the NCSN++ and DDPM++ models in our paper, this codebase also re-implements many previous score-based models in one place, including NCSN from Generative Modeling by Estimating Gradients of the Data Distribution, NCSNv2 from Improved Techniques for Training Score-Based Generative Models, and DDPM from Denoising Diffusion Probabilistic Models.

It supports training new models and evaluating the sample quality and likelihoods of existing models. We carefully designed the code to be modular and easily extensible to new SDEs, predictors, or correctors.

JAX version

Please find a JAX implementation here, which additionally supports class-conditional generation with a pre-trained classifier, and resuming an evaluation process after pre-emption.

JAX vs. PyTorch

In general, this PyTorch version consumes less memory but runs slower than the JAX version. Here is a benchmark on training an NCSN++ cont. model with the VE SDE. The hardware is 4x Nvidia Tesla V100 GPUs (32GB).

| Framework | Time (seconds per step) | Total memory usage (GB) |
| --- | --- | --- |
| PyTorch | 0.56 | 20.6 |
| JAX (n_jitted_steps=1) | 0.30 | 29.7 |
| JAX (n_jitted_steps=5) | 0.20 | 74.8 |
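
The trade-off in the benchmark can be read off as ratios relative to PyTorch. The snippet below is only arithmetic on the numbers listed above; the helper name `relative_to_pytorch` is made up for illustration.

```python
# (seconds per step, total memory in GB), copied from the benchmark above.
benchmarks = {
    "PyTorch": (0.56, 20.6),
    "JAX (n_jitted_steps=1)": (0.30, 29.7),
    "JAX (n_jitted_steps=5)": (0.20, 74.8),
}

base_time, base_mem = benchmarks["PyTorch"]


def relative_to_pytorch(name):
    """Return (speedup, memory ratio) of one setup relative to PyTorch."""
    time_per_step, memory_gb = benchmarks[name]
    return base_time / time_per_step, memory_gb / base_mem


for name in benchmarks:
    speedup, mem_ratio = relative_to_pytorch(name)
    print(f"{name}: {speedup:.2f}x speed, {mem_ratio:.2f}x memory")
```

On these numbers, JAX with `n_jitted_steps=5` is about 2.8x faster per step than PyTorch while using roughly 3.6x the memory.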

How to run the code

Dependencies

Run the following to install a subset of the necessary Python packages for our code:

pip install -r requirements.txt

Stats files for quantitative evaluation

We provide the stats file for CIFAR-10. You can download cifar10_stats.npz and save it to assets/stats/. Check out #5 on how to compute this stats file for new datasets.

Usage

Train and evaluate our models through main.py.

main.py:
  --config: Training configuration.
    (default: 'None')
  --eval_folder: The folder name for storing evaluation results
    (default: 'eval')
  --mode: <train|eval>: Running mode: train or eval
  --workdir: Working directory
  • config is the path to the config file. Our prescribed config files are provided in configs/. They are formatted according to ml_collections and should be quite self-explanatory.

    Naming conventions of config files: the path of a config file is a combination of the following dimensions:

    • dataset: One of cifar10, celeba, celebahq, celebahq_256, ffhq_256, ffhq.
    • model: One of ncsn, ncsnv2, ncsnpp, ddpm, ddpmpp.
    • continuous: train the model with continuously sampled time steps.
  • workdir is the path that stores all artifacts of one experiment, like checkpoints, samples, and evaluation results.

  • eval_folder is the name of a subfolder in workdir that stores all artifacts of the evaluation process, like meta checkpoints for pre-emption prevention, image samples, and numpy dumps of quantitative results.

  • mode is either "train" or "eval". When set to "train", it starts the training of a new model, or resumes the training of an old model if its meta-checkpoints (for resuming after pre-emption in a cloud environment) exist in workdir/checkpoints-meta. When set to "eval", it can do an arbitrary combination of the following:

    • Evaluate the loss function on the test / validation dataset.

    • Generate a fixed number of samples and compute their Inception score, FID, or KID. Prior to evaluation, stats files must have already been downloaded or computed and stored in assets/stats.

    • Compute the log-likelihood on the training or test dataset.

    These functionalities can be configured through config files, or more conveniently, through the command-line support of the ml_collections package. For example, to generate samples and evaluate sample quality, supply the --config.eval.enable_sampling flag; to compute log-likelihoods, supply the --config.eval.enable_bpd flag, and specify --config.eval.dataset=train/test to indicate whether to compute the likelihoods on the training or test dataset.
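
As a sketch of how these flags combine, the helper below assembles the argument list for one run of main.py. It only builds the command and does not execute it. The function `build_command`, the config path, and the workdir are hypothetical, chosen for illustration; the flag names are the ones documented above, and a `True` override is emitted as a bare flag, matching the `--config.eval.enable_sampling` form.

```python
import shlex


def build_command(config, workdir, mode, eval_folder=None, **overrides):
    """Assemble the argv list for one main.py run.

    Keyword overrides become --config.<key> flags; a double underscore in the
    keyword stands for a dot (eval__dataset -> --config.eval.dataset).
    """
    if mode not in ("train", "eval"):
        raise ValueError("mode must be 'train' or 'eval'")
    cmd = ["python", "main.py", f"--config={config}",
           f"--workdir={workdir}", f"--mode={mode}"]
    if eval_folder is not None:
        cmd.append(f"--eval_folder={eval_folder}")
    for key, value in overrides.items():
        flag = f"--config.{key.replace('__', '.')}"
        # Boolean switches are passed as bare flags, other values as key=value.
        cmd.append(flag if value is True else f"{flag}={value}")
    return cmd


# Hypothetical config path and workdir, for illustration only.
CONFIG = "configs/ve/cifar10_ncsnpp_continuous.py"
WORKDIR = "runs/ve_cifar10"

train_cmd = build_command(CONFIG, WORKDIR, "train")
eval_cmd = build_command(
    CONFIG, WORKDIR, "eval", eval_folder="eval",
    eval__enable_sampling=True, eval__enable_bpd=True, eval__dataset="test",
)
print(shlex.join(train_cmd))
print(shlex.join(eval_cmd))
```

The resulting lists can be passed to `subprocess.run` or pasted into a shell after joining.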

How to extend the code

  • New SDEs: inherit from the sde_lib.SDE abstract class and implement all abstract methods. The discretize() method is optional and defaults to Euler-Maruyama discretization. Existing sampling methods and likelihood computation will automatically work for the new SDE.
  • New predictors: inherit from the sampling.Predictor abstract class, implement the update_fn abstract method, and register its name with @register_predictor. The new predictor can be used directly in sampling.get_pc_sampler for Predictor-Corrector sampling, and in all other controllable generation methods in controllable_generation.py.
  • New correctors: inherit from the sampling.Corrector abstract class, implement the update_fn abstract method, and register its name with @register_corrector. The new corrector can be used directly in sampling.get_pc_sampler, and in all other controllable generation methods in controllable_generation.py.

Pretrained checkpoints

All checkpoints are provided in this Google drive.

Instructions: You may find two checkpoints for some models. The first checkpoint (with the smaller number) is the one for which we reported FID scores in Table 3 of our paper (it also corresponds to the FID and IS columns in the table below). The second checkpoint (with the larger number) is the one for which we reported likelihood values and FIDs of black-box ODE samplers in Table 2 of our paper (the FID (ODE) and NLL (bits/dim) columns in the table below). The former corresponds to the smallest FID during the course of training (evaluated every 50k iterations). The latter is the last checkpoint during training.

Per Google's policy, we cannot release our original CelebA and CelebA-HQ checkpoints. That said, I have re-trained models on FFHQ 1024px, FFHQ 256px and CelebA-HQ 256px with personal resources, and they achieved similar performance to our internal checkpoints.

Here is a detailed list of checkpoints and their results reported in the paper. FID (ODE) corresponds to the sample quality of a black-box ODE solver applied to the probability flow ODE.

| Checkpoint path | FID | IS | FID (ODE) | NLL (bits/dim) |
| --- | --- | --- | --- | --- |
| ve/cifar10_ncsnpp/ | 2.45 | 9.73 | - | - |
| ve/cifar10_ncsnpp_continuous/ | 2.38 | 9.83 | - | - |
| ve/cifar10_ncsnpp_deep_continuous/ | 2.20 | 9.89 | - | - |
| vp/cifar10_ddpm/ | 3.24 | - | 3.37 | 3.28 |
| vp/cifar10_ddpm_continuous | - | - | 3.69 | 3.21 |
| vp/cifar10_ddpmpp | 2.78 | 9.64 | - | - |
| vp/cifar10_ddpmpp_continuous | 2.55 | 9.58 | 3.93 | 3.16 |
| vp/cifar10_ddpmpp_deep_continuous | 2.41 | 9.68 | 3.08 | 3.13 |
| subvp/cifar10_ddpm_continuous | - | - | 3.56 | 3.05 |
| subvp/cifar10_ddpmpp_continuous | 2.61 | 9.56 | 3.16 | 3.02 |
| subvp/cifar10_ddpmpp_deep_continuous | 2.41 | 9.57 | 2.92 | 2.99 |

| Checkpoint path | Samples |
| --- | --- |
| ve/bedroom_ncsnpp_continuous | bedroom_samples |
| ve/church_ncsnpp_continuous | church_samples |
| ve/ffhq_1024_ncsnpp_continuous | ffhq_1024 |
| ve/ffhq_256_ncsnpp_continuous | ffhq_256_samples |
| ve/celebahq_256_ncsnpp_continuous | celebahq_256_samples |

Demonstrations and tutorials

| Link | Description |
| --- | --- |
| Open In Colab | Load our pretrained checkpoints and play with sampling, likelihood computation, and controllable synthesis (JAX + FLAX) |
| Open In Colab | Load our pretrained checkpoints and play with sampling, likelihood computation, and controllable synthesis (PyTorch) |
| Open In Colab | Tutorial of score-based generative models in JAX + FLAX |
| Open In Colab | Tutorial of score-based generative models in PyTorch |

Tips

  • When using the JAX codebase, you can jit multiple training steps together to improve training speed at the cost of more memory usage. This can be set via config.training.n_jitted_steps. For CIFAR-10, we recommend using config.training.n_jitted_steps=5 when your GPU/TPU has sufficient memory; otherwise we recommend using config.training.n_jitted_steps=1. Our current implementation requires config.training.log_freq to be divisible by n_jitted_steps for logging and checkpointing to work normally.
  • The snr (signal-to-noise ratio) parameter of LangevinCorrector behaves somewhat like a temperature parameter. Larger snr typically results in smoother samples, while smaller snr gives more diverse but lower-quality samples. Typical values of snr are 0.05 to 0.2, and it requires tuning to find the sweet spot.
  • For VE SDEs, we recommend choosing config.model.sigma_max to be the maximum pairwise distance between data samples in the training dataset.
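
The sigma_max recommendation can be illustrated with a brute-force computation. This is a sketch, not a utility from this codebase: `max_pairwise_distance` is a hypothetical helper, the toy dataset is made up, and it assumes each sample is flattened to a vector in the same scaling the model is trained on. The loop visits every pair, so for a real dataset you would likely run it on a random subset, which gives a lower bound on the true maximum.

```python
import itertools
import math


def max_pairwise_distance(samples):
    """Largest Euclidean distance between any two samples (O(n^2) pairs)."""
    return max(
        (math.dist(a, b) for a, b in itertools.combinations(samples, 2)),
        default=0.0,
    )


# Three tiny "images", each flattened to a vector of pixel values in [0, 1].
toy_dataset = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]

sigma_max = max_pairwise_distance(toy_dataset)
print(sigma_max)  # 2.0
```

The value found this way would then be the candidate for config.model.sigma_max.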

References

If you find the code useful for your research, please consider citing

@inproceedings{
  song2021scorebased,
  title={Score-Based Generative Modeling through Stochastic Differential Equations},
  author={Yang Song and Jascha Sohl-Dickstein and Diederik P Kingma and Abhishek Kumar and Stefano Ermon and Ben Poole},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=PxTIG12RRHS}
}

This work is built upon some previous papers which might also interest you:

  • Song, Yang, and Stefano Ermon. "Generative Modeling by Estimating Gradients of the Data Distribution." Proceedings of the 33rd Annual Conference on Neural Information Processing Systems. 2019.
  • Song, Yang, and Stefano Ermon. "Improved techniques for training score-based generative models." Proceedings of the 34th Annual Conference on Neural Information Processing Systems. 2020.
  • Ho, Jonathan, Ajay Jain, and Pieter Abbeel. "Denoising diffusion probabilistic models." Proceedings of the 34th Annual Conference on Neural Information Processing Systems. 2020.