Deep Convolutional Generative Adversarial Networks

Overview

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

Alec Radford, Luke Metz, Soumith Chintala

All images in this paper are generated by a neural network. They are NOT REAL.

Full paper here: http://arxiv.org/abs/1511.06434

###Other implementations of DCGAN

##Summary of DCGAN We

  • stabilize Generative Adversarial networks with some architectural constraints
    • Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
    • Use batchnorm in both the generator and the discriminator
    • Remove fully connected hidden layers for deeper architectures. Just use average pooling at the end.
    • Use ReLU activation in generator for all layers except for the output, which uses Tanh.
    • Use LeakyReLU activation in the discriminator for all layers.
  • use the discriminator as a pre-trained net for CIFAR-10 classification and show pretty decent results.
  • generate really cool bedroom images that look super real
  • To convince you that the network is not cheating:
    • show the interpolated latent space, where transitions are really smooth and every image in the latent space is a bedroom.
    • show bedrooms after one epoch of training (with a 0.0002 learning rate), come on the network cant really memorize at this stage.
  • To explore what the representations that the network learnt,
    • show deconvolution over the filters, to show that maximal activations occur at objects like windows and beds
    • figure out a way to identify and remove filters that draw windows in generation.
      • Now you can control the generator to not output certain objects.
  • Because we are tripping
    • Smiling woman - neutral woman + neutral man = Smiling man. Whuttttt!
    • man with glasses - man without glasses + woman without glasses = woman with glasses. Omg!!!!
  • learnt a latent space in a completely unsupervised fashion where ROTATIONS ARE LINEAR in this latent space. WHHHAAATT????!!!!!!
  • Figure 11, trained on imagenet has a plane with bird legs. so cooool.

Bedrooms after 5 epochs

Generated bedrooms after five epochs of training. There appears to be evidence of visual under-fitting via repeated textures across multiple samples.

Bedrooms after 1 epoch

Generated bedrooms after one training pass through the dataset. Theoretically, the model could learn to memorize training examples, but this is experimentally unlikely as we train with a small learning rate and minibatch SGD. We are aware of no prior empirical evidence demonstrating memorization with SGD and a small learning rate in only one epoch.

Walking from one point to another in bedroom latent space

Interpolation between a series of 9 random points in Z show that the space learned has smooth transitions, with every image in the space plausibly looking like a bedroom. In the 6th row, you see a room without a window slowly transforming into a room with a giant window. In the 10th row, you see what appears to be a TV slowly being transformed into a window.

Forgetting to draw windows

Top row: un-modified samples from model. Bottom row: the same samples generated with dropping out ”window” filters. Some windows are removed, others are transformed into objects with similar visual appearance such as doors and mirrors. Although visual quality decreased, overall scene composition stayed similar, suggesting the generator has done a good job disentangling scene representation from object representation. Extended experiments could be done to remove other objects from the image and modify the objects the generator draws.

Google image search from generations

Arithmetic on faces

Rotations are linear in latent space

More faces

Album covers

Imagenet generations

Owner
Alec Radford
Alec Radford
Deep Learning Based Fasion Recommendation System for Ecommerce

Project Name: Fasion Recommendation System for Ecommerce A Deep learning based streamlit web app which can recommened you various types of fasion prod

BAPPY AHMED 13 Dec 13, 2022
Official implementation of FCL-taco2: Fast, Controllable and Lightweight version of Tacotron2 @ ICASSP 2021

FCL-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech synthesis (ICASSP 2021) Paper | Demo Block diagram of FCL-taco2, where the decode

Disong Wang 39 Sep 28, 2022
Gapmm2: gapped alignment using minimap2 (align transcripts to genome)

gapmm2: gapped alignment using minimap2 This tool is a wrapper for minimap2 to r

Jon Palmer 2 Jan 27, 2022
CPPE - 5 (Medical Personal Protective Equipment) is a new challenging object detection dataset

CPPE - 5 CPPE - 5 (Medical Personal Protective Equipment) is a new challenging dataset with the goal to allow the study of subordinate categorization

Rishit Dagli 53 Dec 17, 2022
Classical OCR DCNN reproduction based on PaddlePaddle framework.

Paddle-SVHN Classical OCR DCNN reproduction based on PaddlePaddle framework. This project reproduces Multi-digit Number Recognition from Street View I

1 Nov 12, 2021
Code for "Primitive Representation Learning for Scene Text Recognition" (CVPR 2021)

Primitive Representation Learning Network (PREN) This repository contains the code for our paper accepted by CVPR 2021 Primitive Representation Learni

Ruijie Yan 76 Jan 02, 2023
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
3D-Transformer: Molecular Representation with Transformer in 3D Space

3D-Transformer: Molecular Representation with Transformer in 3D Space

55 Dec 19, 2022
Position detection system of mobile robot in the warehouse enviroment

Autonomous-Forklift-System About | GUI | Tests | Starting | License | Author | 🎯 About An application that run the autonomous forklift paletization a

Kamil Goś 1 Nov 24, 2021
BEAS: Blockchain Enabled Asynchronous & Secure Federated Machine Learning

BEAS Blockchain Enabled Asynchronous and Secure Federated Machine Learning Default Network Configuration: The default application uses the HyperLedger

Harpreet Virk 11 Nov 20, 2022
The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks This folder contains the code to reproduce the data in "The Implicit Bias o

Samuel Lippl 0 Feb 05, 2022
Cognate Detection Repository

Cognate Detection Repository Details This repository contains the data for two publications: Challenge Dataset of Cognates and False Friend Pairs from

Diptesh Kanojia 1 Apr 26, 2022
Pytorch Geometric Tutorials

Pytorch Geometric Tutorials

Antonio Longa 648 Jan 08, 2023
The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

The PASS dataset: pretrained models and how to get the data - PASS: Pictures without humAns for Self-Supervised Pretraining

Yuki M. Asano 249 Dec 22, 2022
First-Order Probabilistic Programming Language

FOPPL: A First-Order Probabilistic Programming Language This is an implementation of FOPPL, an S-expression based probabilistic programming language d

Renato Costa 23 Dec 20, 2022
A code generator from ONNX to PyTorch code

onnx-pytorch Generating pytorch code from ONNX. Currently support onnx==1.9.0 and torch==1.8.1. Installation From PyPI pip install onnx-pytorch From

Wenhao Hu 94 Jan 06, 2023
Compact Bidirectional Transformer for Image Captioning

Compact Bidirectional Transformer for Image Captioning Requirements Python 3.8 Pytorch 1.6 lmdb h5py tensorboardX Prepare Data Please use git clone --

YE Zhou 19 Dec 12, 2022
Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts

Face mask detection Face Mask Detection System built with OpenCV, TensorFlow using Computer Vision concepts in order to detect face masks in static im

Vaibhav Shukla 1 Oct 27, 2021
"Exploring Vision Transformers for Fine-grained Classification" at CVPRW FGVC8

FGVC8 Exploring Vision Transformers for Fine-grained Classification paper presented at the CVPR 2021, The Eight Workshop on Fine-Grained Visual Catego

Marcos V. Conde 19 Dec 06, 2022
Code for testing convergence rates of Lipschitz learning on graphs

📈 LipschitzLearningRates The code in this repository reproduces the experimental results on convergence rates for k-nearest neighbor graph infinity L

2 Dec 20, 2021