[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

Overview

1S-Lab, Nanyang Technological University  2SenseTime Research  3Shanghai AI Laboratory
*equal contribution  +corresponding author

Accepted to SIGGRAPH 2022 (Journal Track)

TL;DR

AvatarCLIP generates and animates avatars given descriptions of body shapes, appearances and motions.

A tall and skinny female soldier that is arguing. A skinny ninja that is raising both arms. An overweight sumo wrestler that is sitting. A tall and fat Iron Man that is running.

This repository contains the official implementation of AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars.


[Project Page][arXiv][High-Res PDF (166M)][Supplementary Video][Colab Demo]

Updates

[05/2022] Paper uploaded to arXiv.

[05/2022] Added a Colab Demo for avatar generation!

[05/2022] Support converting the generated avatar to the animatable FBX format! Check out how to use the FBX models, or see the instructions for the conversion code.

[05/2022] Code release for avatar generation part!

[04/2022] AvatarCLIP is accepted to SIGGRAPH 2022 (Journal Track) 🥳 !

Citation

If you find our work useful for your research, please consider citing the paper:

@article{hong2022avatarclip,
    title={AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars},
    author={Hong, Fangzhou and Zhang, Mingyuan and Pan, Liang and Cai, Zhongang and Yang, Lei and Liu, Ziwei},
    journal={ACM Transactions on Graphics (TOG)},
    volume={41},
    number={4},
    articleno={161},
    pages={1--19},
    year={2022},
    publisher={ACM New York, NY, USA},
    doi={10.1145/3528223.3530094},
}

Use Generated FBX Models

Download

Visit our project page and go to the 'Avatar Gallery' section. Pick a model you like and click 'Load Model' below it. Then click the 'Download FBX' link at the bottom of the pop-up viewer.

Import to Your Favourite 3D Software (e.g. Blender, Unity3D)

The FBX models are already rigged. Use your motion library to animate them!

Upload to Mixamo

To make use of the rich motion library provided by Mixamo, you can also upload the FBX model to Mixamo. The rigging process is completely automatic!

Installation

We recommend using Anaconda to manage the Python environment. The setup commands below are provided for reference.

git clone https://github.com/hongfz16/AvatarCLIP.git
cd AvatarCLIP
conda create -n AvatarCLIP python=3.7
conda activate AvatarCLIP
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch
pip install -r requirements.txt

In addition to the steps above, you should install neural_renderer following its instructions. Before compiling neural_renderer (doing it after compiling should also be fine), remember to add the following three lines to neural_renderer/perspective.py after line 19.

x[z<=0] = 0
y[z<=0] = 0
z[z<=0] = 0

This quick fix addresses a rendering issue where objects behind the camera are also rendered. Be careful when using this patched version of neural_renderer in your other projects, because the fix makes the rendering process non-differentiable.
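To make the effect of the three lines concrete, here is a minimal pure-Python sketch of the same masked assignment. It is not neural_renderer code: the function name `collapse_behind_camera` is hypothetical, and plain lists stand in for the tensors that the real file operates on.

```python
def collapse_behind_camera(xs, ys, zs):
    """Mimic x[z<=0] = 0; y[z<=0] = 0; z[z<=0] = 0 on plain lists.

    Hypothetical helper for illustration only, not part of neural_renderer.
    """
    # Compute the mask once, before z itself is overwritten.
    behind = [z <= 0 for z in zs]
    new_xs = [0.0 if b else x for x, b in zip(xs, behind)]
    new_ys = [0.0 if b else y for y, b in zip(ys, behind)]
    new_zs = [0.0 if b else z for z, b in zip(zs, behind)]
    return new_xs, new_ys, new_zs


if __name__ == "__main__":
    xs, ys, zs = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [2.0, -1.0, 0.0]
    # The second and third points have z <= 0 and are zeroed out.
    print(collapse_behind_camera(xs, ys, zs))
```

Every point with z <= 0 is overwritten with zeros, so its original coordinates no longer influence the result, which is in line with the differentiability caveat above.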

Data Preparation

Download SMPL Models

Register and download SMPL models here. Put the downloaded models in the folder smpl_models. The folder structure should look like

./
├── ...
└── smpl_models/
    └── smpl/
        ├── SMPL_FEMALE.pkl
        ├── SMPL_MALE.pkl
        └── SMPL_NEUTRAL.pkl
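
Before running anything, it can help to confirm that the layout above is in place. The following is a small sketch, not part of the repository; the function name `missing_smpl_files` is hypothetical, and it only checks for the three file names listed in the tree.

```python
from pathlib import Path

# File names taken from the folder structure shown above.
EXPECTED_SMPL_FILES = ("SMPL_FEMALE.pkl", "SMPL_MALE.pkl", "SMPL_NEUTRAL.pkl")


def missing_smpl_files(repo_root="."):
    """Return the expected SMPL model paths that are absent under repo_root."""
    smpl_dir = Path(repo_root) / "smpl_models" / "smpl"
    return [smpl_dir / name for name in EXPECTED_SMPL_FILES
            if not (smpl_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_smpl_files(".")
    if missing:
        print("Missing SMPL files:")
        for path in missing:
            print("  " + str(path))
    else:
        print("SMPL folder layout looks complete.")
```

Run it from the repository root; an empty result means all three model files were found.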

Download Pretrained Models & Other Data

This download is only needed for coarse shape generation. You can skip it if you only want to use the other parts. Download the pretrained weights and other required data here. Put them in the folder AvatarGen so that the folder structure looks like

./
├── ...
└── AvatarGen/
    └── ShapeGen/
        └── data/
            ├── codebook.pth
            ├── model_VAE_16.pth
            ├── nongrey_male_0110.jpg
            ├── smpl_uv.mtl
            └── smpl_uv.obj

Avatar Generation

Coarse Shape Generation

The folder AvatarGen/ShapeGen contains the code for this part. Run the following command to generate the coarse shape corresponding to the shape description 'a strong man'. We recommend using the prompt augmentation 'a 3d rendering of xxx in unreal engine' for better results. The generated coarse body mesh will be stored under AvatarGen/ShapeGen/output/coarse_shape.

python main.py --target_txt 'a 3d rendering of a strong man in unreal engine'

Then we need to render the mesh for initialization of the implicit avatar representation. Use the following command for rendering.

python render.py --coarse_shape_obj output/coarse_shape/a_3d_rendering_of_a_strong_man_in_unreal_engine.obj --output_folder ${RENDER_FOLDER}
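
In the two commands above, the OBJ file name appears to be the prompt with spaces replaced by underscores. The sketch below builds the path under that assumption, which is inferred from this one example and not confirmed against the code; `coarse_shape_obj_path` is a hypothetical helper name.

```python
from pathlib import Path


def coarse_shape_obj_path(target_txt, output_dir="output/coarse_shape"):
    """Guess the OBJ path for a prompt.

    Assumption (inferred from the example above): spaces become underscores.
    """
    return Path(output_dir) / (target_txt.replace(" ", "_") + ".obj")


if __name__ == "__main__":
    prompt = "a 3d rendering of a strong man in unreal engine"
    print(coarse_shape_obj_path(prompt).as_posix())
```

If the generated file is named differently for your prompt (for example, one containing punctuation), check the contents of output/coarse_shape instead of relying on this guess.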

Shape Sculpting and Texture Generation

Note that all the code was tested on an NVIDIA V100 (32GB memory). To run on GPUs with less memory, try scaling down the network or lowering max_ray_num in the config files. You can refer to confs/examples_small/example.conf or our Colab demo for a scaled-down version of AvatarCLIP.

The folder AvatarGen/AppearanceGen contains the code for this part. We provide data, a pretrained model and scripts to perform shape sculpting and texture generation on a zero-beta body (the mean shape defined by SMPL). We provide many example configs under AvatarGen/AppearanceGen/confs/examples. For example, to generate 'Abraham Lincoln', which is defined in the config file confs/examples/abrahamlincoln.conf, use the following command.

python main.py --mode train_clip --conf confs/examples/abrahamlincoln.conf

Results will be stored in AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln.

If you wish to perform shape sculpting and texture generation on the previously generated coarse shape, we also provide example config files in confs/base_models/astrongman.conf and confs/astrongman/*.conf. Two optimization steps are required, as follows.

# Initialization of the implicit avatar
python main.py --mode train --conf confs/base_models/astrongman.conf
# Shape sculpting and texture generation on the initialized implicit avatar
python main.py --mode train_clip --conf confs/astrongman/hulk.conf
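
The two steps must run in order, and the second is only useful if the first succeeds. As a convenience, the sketch below chains them with `subprocess`; it is not part of the repository, the helper names are hypothetical, and it assumes you launch it from AvatarGen/AppearanceGen. It uses `sys.executable` in place of `python` so the active environment's interpreter is used.

```python
import subprocess
import sys


def two_stage_commands(base_conf, clip_conf):
    """Build the two main.py invocations in the order they must run."""
    return [
        [sys.executable, "main.py", "--mode", "train", "--conf", base_conf],
        [sys.executable, "main.py", "--mode", "train_clip", "--conf", clip_conf],
    ]


def run_two_stage(base_conf, clip_conf, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check=True raises CalledProcessError if a stage exits with an error,
    so the second stage never starts after a failed first stage.
    """
    for cmd in two_stage_commands(base_conf, clip_conf):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # dry_run=True only prints the commands; set it to False to train.
    run_two_stage("confs/base_models/astrongman.conf",
                  "confs/astrongman/hulk.conf", dry_run=True)
```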

Marching Cubes

To extract meshes from the generated implicit avatar, one may use the following command.

python main.py --mode validate_mesh --conf confs/examples/abrahamlincoln.conf

The final high-resolution mesh will be stored as AvatarCLIP/AvatarGen/AppearanceGen/exp/smpl/examples/abrahamlincoln/meshes/00030000.ply
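
The file name above looks like a zero-padded iteration count. Assuming every mesh in that folder follows this naming (an inference from the single path shown, not something the text confirms), a short helper can pick the most recent one. `latest_mesh` is a hypothetical name.

```python
from pathlib import Path


def latest_mesh(meshes_dir):
    """Return the .ply file with the highest numeric stem, or None.

    Assumes mesh files are named by iteration count, e.g. 00030000.ply;
    files with non-numeric names are ignored.
    """
    candidates = [p for p in Path(meshes_dir).glob("*.ply") if p.stem.isdigit()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: int(p.stem))


if __name__ == "__main__":
    print(latest_mesh("exp/smpl/examples/abrahamlincoln/meshes"))
```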

Convert Avatar to FBX Format

To make it convenient to use the generated avatar in modern graphics pipelines, we also provide scripts to rig the avatar and convert it to FBX format. See the instructions here.

Motion Generation

TBA

License

Distributed under the MIT License. See LICENSE for more information.

Related Works

There are lots of wonderful works that inspired our work or came around the same time as ours.

Dream Fields enables zero-shot text-driven general 3D object generation using CLIP and NeRF.

Text2Mesh proposes to edit a template mesh by predicting offsets and colors per vertex using CLIP and differentiable rendering.

CLIP-NeRF can manipulate 3D objects represented by NeRF with natural language or exemplar images by leveraging CLIP.

Text to Mesh facilitates zero-shot text-driven general mesh generation by deforming from a sphere mesh guided by CLIP.

MotionCLIP establishes a projection from the CLIP text space to the motion space through supervised training, which leads to amazing text-driven motion generation results.

Acknowledgements

This study is supported by NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).

We thank the following repositories for their contributions to our implementation: NeuS, smplx, vposer, Smplx2FBX.
