3D Generative Adversarial Network

Overview

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

This repository contains pre-trained models and sampling code for the 3D Generative Adversarial Network (3D-GAN) presented at NIPS 2016.

http://3dgan.csail.mit.edu

Prerequisites

Torch

We use Torch 7 (http://torch.ch) for our implementation with these additional packages:

Visualization

  • Basic visualization: MATLAB (tested on R2016b)
  • Advanced visualization: Python 2.7 with package numpy, matplotlib, scipy and vtk (version 5.10.1)

Note: for advanced visualization, the version of vtk has to be 5.10.1, not above. It is available in the package list of common Python distributions like Anaconda

Installation

Our current release has been tested on Ubuntu 14.04.

Cloning the repository

git clone [email protected]:zck119/3dgan-release.git
cd 3dgan-release

Downloading pretrained models

For CPU (947 MB):

./download_models_cpu.sh

For GPU (618 MB):

./download_models_gpu.sh

Downloading latent vector inputs for demo

./download_demo_inputs.sh

Guide

Synthesizing shapes (main.lua)

We show how to synthesize shapes with our pre-trained models. The file (main.lua) has the following options.

  • -gpu ID: GPU ID (starting from 1). Set to 0 to use CPU only.
  • -class CLASSNAME: synthesize shapes for the class CLASSNAME. We currently support five classes (car, chair, desk, gun, and sofa). Use all to generate shapes for each class.
  • -sample: whether to sample input latent vectors from an i.i.d. uniform distribution, or to generate shapes with demo vectors loaded from ./demo_inputs/CLASSNAME.mat
  • -bs BATCH_SIZE: use batch size of BATCH_SIZE during network forwarding
  • -ss SAMPLE_SIZE: set the number of generated shapes to SAMPLE_SIZE. This option is only available in -sample mode.

Usages include

  • Synthesize chairs with pre-sampled demo inputs and a CPU
th main.lua -gpu 0 -class chair 
  • Randomly sample 150 desks with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class desk -bs 50 -sample -ss 150 
  • Randomly sample 150 shapes of each category with GPU 1 and a batch size of 50
th main.lua -gpu 1 -class all -bs 50 -sample -ss 150 

The output is saved under folder ./output, with class_name_demo.mat for shapes generated by predetermined demo inputs (z in our paper), and class_name_sample.mat for randomly sampled 3D shapes. The variable inputs in the .mat file correponds to the input latent representation, and the variable voxels corresponds to the generated 3D shapes by our network.

Visualization

We offer two ways of visualizing results, one in MATLAB and the other in Python. We used the Python visualization in our paper. The MATLAB visualization is easier to install and run, but its output has a lower quality compared with the Python one.

MATLAB: Please use the function visualization/matlab/visualize.m for visualization. The MATLAB code allows users to either display rendered objects or save them as images. The script also supports downsampling and thresholding for faster rendering. The color of voxels represents the confidence value.

Options include

  • inputfile: the .mat file that saves the voxel matrices
  • indices: the indices of objects in the inputfile that should be rendered. The default value is 0, which stands for rendering all objects.
  • step (s): downsample objects via a max pooling of step s for efficiency. The default value is 4 (64 x 64 x 64 -> 16 x 16 x 16).
  • threshold: voxels with confidence lower than the threshold are not displayed
  • outputprefix:
    • when not specified, Matlab shows figures directly.
    • when specified, Matlab stores rendered images of objects at outputprefix_%i.bmp, where %i is the index of objects

Usage (after running th main.lua -gpu 0 -class chair, in MATLAB, in folder visualization/matlab):

visualize('../../output/chair_demo.mat', 0, 2, 0.1, 'chair')

The visualization might take a while. The obtained rendering (chair_1/3/4/5.bmp) should look as follows.

Python: Options for the Python visualization include

  • -t THRESHOLD: voxels with confidence lower than the threshold are not displayed. The default value is 0.1.
  • -i ID: the index of objects in the inputfile that should be rendered (one based). The default value is 1.
  • -df STEPSIZE: downsample objects via a max pooling of step STEPSIZE for efficiency. Currently supporting STEPSIZE 1, 2, and 4. The default value is 1 (i.e. no downsampling).
  • -dm METHOD: downsample method, where mean stands for average pooling and max for max pooling. The default is max pooling.
  • -u BLOCK_SIZE: set the size of the voxels to BLOCK_SIZE. The default value is 0.9.
  • -cm: whether to use a colormap to represent voxel occupancy, or to use a uniform color
  • -mc DISTANCE: whether to keep only the maximal connected component, where voxels of distance no larger than DISTANCE are considered connected. Set to 0 to disable this function. The default value is 3.

Usage:

python visualize.py chair_demo.mat -u 0.9 -t 0.1 -i 1 -mc 2

Reference

@inproceedings{3dgan,
  title={{Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling}},
  author={Wu, Jiajun and Zhang, Chengkai and Xue, Tianfan and Freeman, William T and Tenenbaum, Joshua B},
  booktitle={Advances In Neural Information Processing Systems},
  pages={82--90},
  year={2016}
}

For any questions, please contact Jiajun Wu ([email protected]) and Chengkai Zhang ([email protected]).

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset

YOLOv5 🚀 is a family of object detection architectures and models pretrained on the COCO dataset, and represents Ultralytics open-source research int

阿才 73 Dec 16, 2022
Contrastive unpaired image-to-image translation, faster and lighter training than cyclegan (ECCV 2020, in PyTorch)

Contrastive Unpaired Translation (CUT) video (1m) | video (10m) | website | paper We provide our PyTorch implementation of unpaired image-to-image tra

1.7k Dec 27, 2022
Plaything for Autistic Children (demo for PaddlePaddle/Wechaty/Mixlab project)

星星的孩子 - 一款为孤独症孩子设计的聊天机器人游戏 孤独症儿童是目前常常被忽视的一类群体。他们有着类似性格内向的特征,实际却受着广泛性发育障碍的折磨。 项目背景 这类儿童在与人交往时存在着沟通障碍,其特点表现在: 社交交流差,互动障碍明显 认知能力有限,被动认知 兴趣狭窄,重复刻板,缺乏变化和想象

Tianyi Pan 35 Nov 24, 2022
Reproduce ResNet-v2(Identity Mappings in Deep Residual Networks) with MXNet

Reproduce ResNet-v2 using MXNet Requirements Install MXNet on a machine with CUDA GPU, and it's better also installed with cuDNN v5 Please fix the ran

Wei Wu 531 Dec 04, 2022
Optimizing Deeper Transformers on Small Datasets

DT-Fixup Optimizing Deeper Transformers on Small Datasets Paper published in ACL 2021: arXiv Detailed instructions to replicate our results in the pap

16 Nov 14, 2022
neural image generation

pixray Pixray is an image generation system. It combines previous ideas including: Perception Engines which uses image augmentation and iteratively op

dribnet 398 Dec 17, 2022
1st Solution For NeurIPS 2021 Competition on ML4CO Dual Task

KIDA: Knowledge Inheritance in Data Aggregation This project releases our 1st place solution on NeurIPS2021 ML4CO Dual Task. Slide and model weights a

MEGVII Research 24 Sep 08, 2022
ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Voice2Series-Reprogramming Voice2Series: Reprogramming Acoustic Models for Time Series Classification International Conference on Machine Learning (IC

49 Jan 03, 2023
A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''.

P-tuning A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''. How to use our code We have released the code

THUDM 562 Dec 27, 2022
Code for ACL2021 long paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

LANKA This is the source code for paper: Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases (ACL 2021, long paper) Referen

Boxi Cao 30 Oct 24, 2022
Poplar implementation of "Bundle Adjustment on a Graph Processor" (CVPR 2020)

Poplar Implementation of Bundle Adjustment using Gaussian Belief Propagation on Graphcore's IPU Implementation of CVPR 2020 paper: Bundle Adjustment o

Joe Ortiz 34 Dec 05, 2022
Code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language"

The repository provides the source code for the paper "Combining Textual Features for the Detection of Hateful and Offensive Language" submitted to HA

Sherzod Hakimov 3 Aug 04, 2022
🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds (CVPR 2020) This is the official implementation of RandLA-Net (CVPR2020, Oral

Qingyong 1k Dec 30, 2022
Implementation of ProteinBERT in Pytorch

ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc

Phil Wang 92 Dec 25, 2022
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022
Drone-based Joint Density Map Estimation, Localization and Tracking with Space-Time Multi-Scale Attention Network

DroneCrowd Paper Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark. Introduction This paper proposes a space-time multi-scale atte

VisDrone 98 Nov 16, 2022
U-Time: A Fully Convolutional Network for Time Series Segmentation

U-Time & U-Sleep Official implementation of The U-Time [1] model for general-purpose time-series segmentation. The U-Sleep [2] model for resilient hig

Mathias Perslev 176 Dec 19, 2022
Robotics with GPU computing

Robotics with GPU computing Cupoch is a library that implements rapid 3D data processing for robotics using CUDA. The goal of this library is to imple

Shirokuma 625 Jan 07, 2023
Code for BMVC2021 paper "Boundary Guided Context Aggregation for Semantic Segmentation"

Boundary-Guided-Context-Aggregation Boundary Guided Context Aggregation for Semantic Segmentation Haoxiang Ma, Hongyu Yang, Di Huang In BMVC'2021 Pape

Haoxiang Ma 31 Jan 08, 2023
HW3 ― GAN, ACGAN and UDA

HW3 ― GAN, ACGAN and UDA In this assignment, you are given datasets of human face and digit images. You will need to implement the models of both GAN

grassking100 1 Dec 13, 2021