Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

Overview

SAFA: Structure Aware Face Animation (3DV2021)

Official Pytorch Implementation of 3DV2021 paper: SAFA: Structure Aware Face Animation.

Screenshot Screenshot Screenshot Screenshot

Screenshot

Getting Started

git clone https://github.com/Qiulin-W/SAFA.git

Installation

Python 3.6 or higher is recommended.

1. Install PyTorch3D

Follow the guidance from: https://github.com/facebookresearch/pytorch3d/blob/master/INSTALL.md.

2. Install Other Dependencies

To install other dependencies run:

pip install -r requirements.txt

Usage

1. Preparation

a. Download FLAME model, choose FLAME 2020 and unzip it, put generic_model.pkl under ./modules/data.

b. Download head_template.obj, landmark_embedding.npy, uv_face_eye_mask.png and uv_face_mask.png from DECA/data, and put them under ./module/data.

c. Download SAFA model checkpoint from Google Drive and put it under ./ckpt.

d. (Optional, required by the face swap demo) Download the pretrained face parser from face-parsing.PyTorch and put it under ./face_parsing/cp.

2. Demos

We provide demos for animation and face swap.

a. Animation demo

python animation_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --relative --adapt_scale --find_best_frame

b. Face swap demo We adopt face-parsing.PyTorch for indicating the face regions in both the source and driving images.

For preprocessed source images and driving videos, run:

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video

For arbitrary images and videos, we use a face detector to detect and swap the corresponding face parts. Cropped images will be resized to 256*256 in order to fit to our model.

python face_swap_demo.py --config config/end2end.yaml --checkpoint path/to/checkpoint --source_image_pth path/to/source_image --driving_video_pth path/to/driving_video --use_detection

Training

We modify the distributed traininig framework used in that of the First Order Motion Model. Instead of using torch.nn.DataParallel (DP), we adopt torch.distributed.DistributedDataParallel (DDP) for faster training and more balanced GPU memory load. The training procedure is divided into two steps: (1) Pretrain the 3DMM estimator, (2) End-to-end Training.

3DMM Estimator Pre-training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/pretrain.yaml

End-to-end Training

CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node 4 run_ddp.py --config config/end2end.yaml --tdmm_checkpoint path/to/tdmm_checkpoint_pth

Evaluation / Inference

Video Reconstrucion

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode reconstruction

Image Animation

python run_ddp.py --config config/end2end.yaml --checkpoint path/to/checkpoint --mode animation

3D Face Reconstruction

python tdmm_inference.py --data_dir directory/to/images --tdmm_checkpoint path/to/tdmm_checkpoint_pth

Dataset and Preprocessing

We use VoxCeleb1 to train and evaluate our model. Original Youtube videos are downloaded, cropped and splited following the instructions from video-preprocessing.

a. To obtain the facial landmark meta data from the preprocessed videos, run:

python video_ldmk_meta.py --video_dir directory/to/preprocessed_videos out_dir directory/to/output_meta_files

b. (Optional) Extract images from videos for 3DMM pretraining:

python extract_imgs.py

Citation

If you find our work useful to your research, please consider citing:

@article{wang2021safa,
  title={SAFA: Structure Aware Face Animation},
  author={Wang, Qiulin and Zhang, Lu and Li, Bo},
  journal={arXiv preprint arXiv:2111.04928},
  year={2021}
}

License

Please refer to the LICENSE file.

Acknowledgement

Here we provide the list of external sources that we use or adapt from:

  1. Codes are heavily borrowed from First Order Motion Model, LICENSE.
  2. Some codes are also borrowed from: a. FLAME_PyTorch, LICENSE b. generative-inpainting-pytorch, LICENSE c. face-parsing.PyTorch, LICENSE d. video-preprocessing.
  3. We adopt FLAME model resources from: a. DECA, LICENSE b. FLAME, LICENSE
  4. External Libaraies: a. PyTorch3D, LICENSE b. face-alignment, LICENSE
Owner
QiulinW
MSc at Imperial College London, now working at JD Technology.
QiulinW
Diverse Image Captioning with Context-Object Split Latent Spaces (NeurIPS 2020)

Diverse Image Captioning with Context-Object Split Latent Spaces This repository is the PyTorch implementation of the paper: Diverse Image Captioning

Visual Inference Lab @TU Darmstadt 34 Nov 21, 2022
Dataset and Source code of paper 'Enhancing Keyphrase Extraction from Academic Articles with their Reference Information'.

Enhancing Keyphrase Extraction from Academic Articles with their Reference Information Overview Dataset and code for paper "Enhancing Keyphrase Extrac

15 Nov 24, 2022
Husein pet projects in here!

project-suka-suka Husein pet projects in here! List of projects mysejahtera-density. Generate resolution points using meshgrid and request each points

HUSEIN ZOLKEPLI 47 Dec 09, 2022
Registration Loss Learning for Deep Probabilistic Point Set Registration

RLLReg This repository contains a Pytorch implementation of the point set registration method RLLReg. Details about the method can be found in the 3DV

Felix Järemo Lawin 35 Nov 02, 2022
TensorFlow2 Classification Model Zoo playing with TensorFlow2 on the CIFAR-10 dataset.

Training CIFAR-10 with TensorFlow2(TF2) TensorFlow2 Classification Model Zoo. I'm playing with TensorFlow2 on the CIFAR-10 dataset. Architectures LeNe

Chia-Hung Yuan 16 Sep 27, 2022
Space robot - (Course Project) Using the space robot to capture the target satellite that is disabled and spinning, then stabilize and fix it up

Space robot - (Course Project) Using the space robot to capture the target satellite that is disabled and spinning, then stabilize and fix it up

Mingrui Yu 3 Jan 07, 2022
PlenOctree Extraction algorithm

PlenOctrees_NeRF-SH This is an implementation of the Paper PlenOctrees for Real-time Rendering of Neural Radiance Fields. Not only the code provides t

49 Nov 05, 2022
Scheduling BilinearRewards

Scheduling_BilinearRewards Requirement Python 3 =3.5 Structure main.py This file includes the main function. For getting the results in Figure 1, ple

junghun.kim 0 Nov 25, 2021
[CVPR 2016] Unsupervised Feature Learning by Image Inpainting using GANs

Context Encoders: Feature Learning by Inpainting CVPR 2016 [Project Website] [Imagenet Results] Sample results on held-out images: This is the trainin

Deepak Pathak 829 Dec 31, 2022
This repository provides the code for MedViLL(Medical Vision Language Learner).

MedViLL This repository provides the code for MedViLL(Medical Vision Language Learner). Our proposed architecture MedViLL is a single BERT-based model

SuperSuperMoon 39 Jan 05, 2023
GNN-based Recommendation Benchmark

GRecX A Fair Benchmark for GNN-based Recommendation Homepage and Documentation Homepage: Documentation: Paper: GRecX: An Efficient and Unified Benchma

73 Oct 17, 2022
Language model Prompt And Query Archive

LPAQA: Language model Prompt And Query Archive This repository contains data and code for the paper How Can We Know What Language Models Know? Install

127 Dec 20, 2022
Build Low Code Automated Tensorflow, What-IF explainable models in just 3 lines of code.

Build Low Code Automated Tensorflow explainable models in just 3 lines of code.

Hasan Rafiq 170 Dec 26, 2022
Adjust Decision Boundary for Class Imbalanced Learning

Adjusting Decision Boundary for Class Imbalanced Learning This repository is the official PyTorch implementation of WVN-RS, introduced in Adjusting De

Peyton Byungju Kim 16 Jan 04, 2023
(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

xxxnell 656 Dec 30, 2022
[AAAI-2022] Official implementations of MCL: Mutual Contrastive Learning for Visual Representation Learning

Mutual Contrastive Learning for Visual Representation Learning This project provides source code for our Mutual Contrastive Learning for Visual Repres

winycg 48 Jan 02, 2023
Leaderboard and Visualization for RLCard

RLCard Showdown This is the GUI support for the RLCard project and DouZero project. RLCard-Showdown provides evaluation and visualization tools to hel

Data Analytics Lab at Texas A&M University 246 Dec 26, 2022
Package for extracting emotions from social media text. Tailored for financial data.

EmTract: Extracting Emotions from Social Media Text Tailored for Financial Contexts EmTract is a tool that extracts emotions from social media text. I

13 Nov 17, 2022
Myia prototyping

Myia Myia is a new differentiable programming language. It aims to support large scale high performance computations (e.g. linear algebra) and their g

Mila 456 Nov 07, 2022
Streamlit component for TensorBoard, TensorFlow's visualization toolkit

streamlit-tensorboard This is a work-in-progress, providing a function to embed TensorBoard, TensorFlow's visualization toolkit, in Streamlit apps. In

Snehan Kekre 27 Nov 13, 2022