PyTorch implementation for NED. It can be used to manipulate the facial emotions of actors in videos based on emotion labels or reference styles.

Related tags

Deep LearningNED
Overview

Neural Emotion Director (NED) - Official Pytorch Implementation

Example video of facial emotion manipulation while retaining the original mouth motion, i.e. speech. We show examples of 3 basic emotions.



This repository contains the source code for our paper:

Neural Emotion Director: Speech-preserving semantic control of facial expressions in โ€œin-the-wildโ€ videos
Foivos Paraperas Papantoniou, Panagiotis P. Filntisis, Petros Maragos, Anastasios Roussos

Project site: https://foivospar.github.io/NED/

Abstract: In this paper, we introduce a novel deep learning method for photo-realistic manipulation of the emotional state of actors in ``in-the-wild'' videos. The proposed method is based on a parametric 3D face representation of the actor in the input scene that offers a reliable disentanglement of the facial identity from the head pose and facial expressions. It then uses a novel deep domain translation framework that alters the facial expressions in a consistent and plausible manner, taking into account their dynamics. Finally, the altered facial expressions are used to photo-realistically manipulate the facial region in the input scene based on an especially-designed neural face renderer. To the best of our knowledge, our method is the first to be capable of controlling the actorโ€™s facial expressions by even using as a sole input the semantic labels of the manipulated emotions, while at the same time preserving the speech-related lip movements. We conduct extensive qualitative and quantitative evaluations and comparisons, which demonstrate the effectiveness of our approach and the especially promising results that we obtain. Our method opens a plethora of new possibilities for useful applications of neural rendering technologies, ranging from movie post-production and video games to photo-realistic affective avatars.

Getting Started

Clone the repo:

git clone https://github.com/foivospar/NED
cd NED

Requirements

Create a conda environment, using the provided environment.yml file.

conda env create -f environment.yml

Activate the environment.

conda activate NED

Files

  1. Follow the instructions in DECA (under the Prepare data section) to acquire the 3 files ('generic_model.pkl', 'deca_model.tar', 'FLAME_albedo_from_BFM.npz') and place them under "./DECA/data".
  2. Fill out the form to get access to the FSGAN's pretrained models. Then download 'lfw_figaro_unet_256_2_0_segmentation_v1.pth' (from the "v1" folder) and place it under "./preprocessing/segmentation".

Video preprocessing

To train or test the method on a specific subject, first create a folder for this subject and place the video(s) of this subject into a "videos" subfolder. To acquire the training/test videos used in our experiments, please contact us.

For example, for testing the method on Tarantino's clip, a structure similar to the following must be created:

Tarantino ----- videos ----- Tarantino_t.mp4

Under the above structure, there are 3 options for the video(s) placed in the "videos" subfolder:

  1. Use it as test footage for this actor and apply our method for manipulating his/her emotion.
  2. Use this footage to train a neural face renderer on the actor (e.g. use the training video one of our 6 Youtube actors, or a footage of similar duration for a new identity).
  3. Use it only as reference clip for transferring the expressive style of the actor to another subject.

To preprocess the video (face detection, segmentation, landmark detection, 3D reconstruction, alignment) run:

./preprocess.sh <celeb_path> <mode>
  • is the path to the folder used for this actor.
  • is one of {train, test, reference} for each of the above cases respectively.

After successfull execution, the following structure will be created:


   
     ----- videos -----video.mp4 (e.g. "Tarantino_t.mp4")
                   |        |
                   |        ---video.txt (e.g. "Tarantino_t.txt", stores the per-frame bounding boxes, created only if mode=test)
                   |
                   --- images (cropped and resized images)
                   |
                   --- full_frames (original frames of the video, created only if mode=test or mode=reference)
                   |
                   --- eye_landmarks (created only if mode=train or mode=test)
                   |
                   --- eye_landmarks_aligned (same as above, but aligned)
                   |
                   --- align_transforms (similarity transformation matrices, created only if mode=train or mode=test)
                   |
                   --- faces (segmented images of the face, created only if mode=train or mode=test)
                   |
                   --- faces_aligned (same as above, but aligned)
                   |
                   --- masks (binary face masks, created only if mode=train or mode=test)
                   |
                   --- masks_aligned (same as above, but aligned)
                   |
                   --- DECA (3D face model parameters)
                   |
                   --- nmfcs (NMFC images, created only if mode=train or mode=test)
                   |
                   --- nmfcs_aligned (same as above, but aligned)
                   |
                   --- shapes (detailed shape images, created only if mode=train or mode=test)
                   |
                   --- shapes_aligned (same as above, but aligned)

   

1.Manipulate the emotion on a test video

Download our pretrained manipulator from here and unzip the checkpoint. We currently provide only the test scripts for the manipulator.

Also, preprocess the test video for one of our target Youtube actors or use a new actor (requires training a new neural face renderer).

For our Youtube actors, we provide pretrained renderer models here. Download the .zip file for the desired actor and unzip it.

Then, assuming that preprocessing (in test mode) has been performed for the selected test video (see above), you can manipulate the expressions of the celebrity in this video by one of the following 2 ways:

1.Label-driven manipulation

Select one of the 7 basic emotions (happy, angry, surprised, neutral, fear, sad, disgusted) and run :

python manipulator/test.py --celeb <celeb_path> --checkpoints_dir ./manipulator_checkpoints --trg_emotions <emotions> --exp_name <exp_name>
  • is the path to the folder used for this actor's test footage (e.g. "./Tarantino").
  • is one or more of the 7 emotions. If one emotion is given, e.g. --trg_emotions happy, all the video will be converted to happy, whereas for 2 or more emotions, such as --trg_emotions happy angry the first half of the video will be happy, the second half angry and so on.
  • is the name of the sub-folder that will be created under the for storing the results.
2.Reference-driven manipulation

In this case, the reference video should first be preprocessed (see above) in reference mode. Then run:

python manipulator/test.py --celeb <celeb_path> --checkpoints_dir ./manipulator_checkpoints --ref_dirs <ref_dirs> --exp_name <exp_name>
  • is the path to the folder used for this actor's test footage (e.g. "./Tarantino").
  • is one or more reference videos. In particular, the path to the "DECA" sublfolder has to be given. As with labels, more than one paths can be given, in which case the video will be transformed sequentially according to those reference styles.
  • is the name of the sub-folder that will be created under the for storing the results.

Then, run:

./postprocess.sh <celeb_path> <exp_name> <checkpoints_dir>
  • is the path to the test folder used for this actor.
  • is the name you have given to the experiment in the previous step.
  • is the path to the pretrained renderer for this actor (e.g. "./checkpoints_tarantino" for Tarantino).

This step performs neural rendering, un-alignment and blending of the modified faces. Finally, you should see the full_frames sub-folder into / . This contains the full frames of the video with the altered emotion. To convert them to video, run:

python postprocessing/images2video.py --imgs_path <full_frames_path> --out_path <out_path> --audio <original_video_path>
  • is the path to the full frames (e.g. "./Tarantino/happy/full_frames").
  • is the path for saving the video (e.g. "./Tarantino_happy.mp4").
  • is the path to the original video (e.g. "./Tarantino/videos/tarantino_t.mp4"). This argment is optional and is used to add the original audio to the generated video.

2.Train a neural face renderer for a new celebrity

Download our pretrained meta-renderer ("checkpoints_meta-renderer.zip") from the link above and unzip the checkpoints.

Assuming that the training video of the new actor has been preprocessed (in train mode) as described above, you can then finetune our meta-renderer on this actor by running:

python renderer/train.py --celeb <celeb_path> --checkpoints_dir <checkpoints_dir> --load_pretrain <pretrain_checkpoints> --which_epoch 15
  • is the path to the train folder used for the new actor.
  • is the new path where the checkpoints will be saved.
  • is the path with the checkpoints of the pretrained meta-renderer (e.g. "./checkpoints_meta-renderer")

3.Preprocess a reference video

If you want to use a reference clip (e.g. from a movie) of another actor to transfer his/her speaking style to your test actor, simply preprocess the reference actor's clip as described above (mode=reference) and follow the instructions on Reference-driven manipulation.

Citation

If you find this work useful for your research, please cite our paper.

@article{paraperas2021neural,
         title={Neural Emotion Director: Speech-preserving semantic control of facial expressions in "in-the-wild" videos}, 
         author={Paraperas Papantoniou, Foivos and Filntisis, Panagiotis P. and Maragos, Petros and Roussos, Anastasios},
         journal={arXiv preprint arXiv:2112.00585},
         year={2021}
}

Acknowledgements

We would like to thank the following great repositories that our code borrows from:

Colab notebook and additional materials for Python-driven analysis of redlining data in Philadelphia

RedliningExploration The Google Colaboratory file contained in this repository contains work inspired by a project on educational inequality in the Ph

Benjamin Warren 1 Jan 20, 2022
Consecutive-Subsequence - Simple software to calculate susequence with highest sum

Simple software to calculate susequence with highest sum This repository contain

Gbadamosi Farouk 1 Jan 31, 2022
CVPR '21: In the light of feature distributions: Moment matching for Neural Style Transfer

In the light of feature distributions: Moment matching for Neural Style Transfer (CVPR 2021) This repository provides code to recreate results present

Nikolai Kalischek 49 Oct 13, 2022
Goal of the project : Detecting Temporal Boundaries in Sign Language videos

MVA RecVis course final project : Goal of the project : Detecting Temporal Boundaries in Sign Language videos. Sign language automatic indexing is an

Loubna Ben Allal 6 Dec 21, 2022
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 07, 2023
Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks

Self-supervised Point Cloud Prediction Using 3D Spatio-temporal Convolutional Networks This is a Pytorch-Lightning implementation of the paper "Self-s

Photogrammetry & Robotics Bonn 111 Dec 06, 2022
Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Region Proportion Regularized Inference (RePRI) for Few-Shot Segmentation In this repo, we provide the code for our paper : "Few-Shot Segmentation Wit

Malik Boudiaf 138 Dec 12, 2022
Static Features Classifier - A static features classifier for Point-Could clusters using an Attention-RNN model

Static Features Classifier This is a static features classifier for Point-Could

ABDALKARIM MOHTASIB 1 Jan 25, 2022
[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

[ICCV 2021] A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation

CodingMan 45 Dec 12, 2022
Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators

Brax is a differentiable physics engine that simulates environments made up of rigid bodies, joints, and actuators. It's also a suite of learning algorithms to train agents to operate in these enviro

Google 1.5k Jan 02, 2023
An index of recommendation algorithms that are based on Graph Neural Networks.

An index of recommendation algorithms that are based on Graph Neural Networks.

FIB LAB, Tsinghua University 564 Jan 07, 2023
A repository for interferometer controller code.

dses-interferometer-controller A repository for interferometer controller code, hardware, and simulations. See dses.science for more information on th

Eli Reed 1 Jan 17, 2022
Data and analysis code for an MS on SK VOC genomes phenotyping/neutralisation assays

Description Summary of phylogenomic methods and analyses used in "Immunogenicity of convalescent and vaccinated sera against clinical isolates of ance

Finlay Maguire 1 Jan 06, 2022
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Optimum Transformers Accelerated NLP pipelines for fast inference ๐Ÿš€ on CPU and GPU. Built with ๐Ÿค— Transformers, Optimum and ONNX runtime. Installatio

Aleksey Korshuk 115 Dec 16, 2022
Creating predictive checklists from data using integer programming.

Learning Optimal Predictive Checklists A Python package to learn simple predictive checklists from data subject to customizable constraints. For more

Healthy ML 5 Apr 19, 2022
A Flexible Generative Framework for Graph-based Semi-supervised Learning (NeurIPS 2019)

G3NN This repo provides a pytorch implementation for the 4 instantiations of the flexible generative framework as described in the following paper: A

Jiaqi Ma 14 Oct 11, 2022
Small repo describing how to use Hugging Face's Wav2Vec2 with PyCTCDecode

๐Ÿค— Transformers Wav2Vec2 + PyCTCDecode Introduction This repo shows how ๐Ÿค— Transformers can be used in combination with kensho-technologies's PyCTCDec

Patrick von Platen 102 Oct 22, 2022
Newt - a Gaussian process library in JAX.

Newt __ \/_ (' \`\ _\, \ \\/ /`\/\ \\ \ \\

AaltoML 0 Nov 02, 2021
Face2webtoon - Despite its importance, there are few previous works applying I2I translation to webtoon.

Despite its importance, there are few previous works applying I2I translation to webtoon. I collected dataset from naver webtoon ์—ฐ์• ํ˜๋ช… and tried to transfer human faces to webtoon domain.

์ด์ƒ์œค 64 Oct 19, 2022
Neural-fractal - Create Fractals Using Complex-Valued Neural Networks!

Neural Fractal Create Fractals Using Complex-Valued Neural Networks! Home Page Features Define Dynamical Systems Using Complex-Valued Neural Networks

Amirabbas Asadi 10 Dec 17, 2022