Learning What and Where to Draw

Last update: Nov 18, 2022

###Learning What and Where to Draw

Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

This is the code for our NIPS 2016 paper on text- and location-controllable image synthesis using conditional GANs. Much of the code is adapted from reedscot/icml2016 and dcgan.torch.

####Setup Instructions

You will need to install Torch, CuDNN, stnbhwd and the display package.

####How to train a text-to-image model:

1. Download the data, including captions, location annotations and pretrained models.
2. Download the birds and humans image data.
3. Modify the CONFIG file to point to your data.
4. Run one of the training scripts, e.g. `./scripts/train_cub_keypoints.sh`
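
Before launching a long training run, it can help to confirm that the files named in the steps above are actually in place. The following is a minimal sketch, not part of the repository: the helper name `check_training_setup` is hypothetical, and it assumes that CONFIG sits at the top of the checkout and that the script path is given relative to that directory.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with this repository).
# Assumption: CONFIG lives at the top level of the checkout.

# check_training_setup REPO_DIR SCRIPT
# Returns 0 if REPO_DIR/CONFIG and REPO_DIR/SCRIPT both exist, 1 otherwise.
check_training_setup() {
    repo_dir="$1"
    script="$2"
    if [ ! -f "$repo_dir/CONFIG" ]; then
        echo "missing CONFIG in $repo_dir" >&2
        return 1
    fi
    if [ ! -f "$repo_dir/$script" ]; then
        echo "missing training script: $script" >&2
        return 1
    fi
    return 0
}
```

From the repository root, one possible use is `check_training_setup . scripts/train_cub_keypoints.sh && ./scripts/train_cub_keypoints.sh`, which starts training only when both files are found. It does not verify that the paths inside CONFIG point to valid data.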

####How to generate samples:

Run `./scripts/run_all_demos.sh`.
HTML files will be generated with results like the following:

Moving the bird's position via bounding box:

Moving the bird's position via keypoints:

Birds text to image with ground-truth keypoints:

Birds text to image with generated keypoints:

Humans text to image with ground-truth keypoints:

Humans text to image with generated keypoints:

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016learning,
  title={Learning What and Where to Draw},
  author={Scott Reed and Zeynep Akata and Santosh Mohan and Samuel Tenka and Bernt Schiele and Honglak Lee},
  booktitle={Advances in Neural Information Processing Systems},
  year={2016}
}
