Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis

Last update: Dec 30, 2022

Overview

face-vid2vid

Usage

Dataset Preparation

cd datasets
wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl
chmod a+rx youtube-dl
python load_videos.py --workers=8
cd ..

Pretrained Headpose Estimator

300W-LP, alpha 1, robust to image quality

Put hopenet_robust_alpha1.pkl here

Train

python train.py --batch_size=4 --gpu_ids=0,1,2,3 --num_epochs=100 (--ckp=10)

On a 2080 Ti, setting batch_size=4 uses up the GPU memory.
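As an illustration of how the flags in the training command fit together, here is a minimal sketch of an argument parser. It is a hypothetical stand-in, not the repository's train.py: the defaults, the help text, and the reading of --ckp as an optional checkpoint epoch to resume from are all assumptions.

```python
import argparse


def parse_gpu_ids(text):
    """Turn a comma-separated string such as "0,1,2,3" into a list of ints."""
    return [int(part) for part in text.split(",") if part]


def build_parser():
    # Hypothetical parser: flag names come from the command above,
    # everything else (defaults, help strings) is assumed.
    parser = argparse.ArgumentParser(description="training flags (sketch)")
    parser.add_argument("--batch_size", type=int, default=4)
    parser.add_argument("--gpu_ids", type=parse_gpu_ids, default=[0])
    parser.add_argument("--num_epochs", type=int, default=100)
    # Assumed meaning: epoch number of a saved checkpoint; None starts fresh.
    parser.add_argument("--ckp", type=int, default=None)
    return parser


# Parse the example command line without the optional --ckp flag.
args = build_parser().parse_args(
    ["--batch_size=4", "--gpu_ids=0,1,2,3", "--num_epochs=100"]
)
resumed = build_parser().parse_args(["--ckp=10"])
```

Under this reading, the parenthesized (--ckp=10) in the command is optional and is left out on a first run.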

Evaluate

Reconstruction:

python evaluate.py --ckp=99 --source=r --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

The first frame is used as source by default
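The driving path in the command appears to pack several fields into the file name, separated by `#`. The sketch below splits such a name apart. The field meanings (speaker id, YouTube video id, start and end frame indices) are an assumption based on the common VoxCeleb preprocessing convention, and `parse_clip_name` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path
from typing import NamedTuple


class ClipName(NamedTuple):
    person_id: str    # assumed: VoxCeleb speaker id
    video_id: str     # assumed: YouTube video id
    start_frame: int  # assumed: first frame index of the clip
    end_frame: int    # assumed: last frame index of the clip


def parse_clip_name(path):
    """Split '<person>#<video>#<start>#<end>.mp4' into its four fields."""
    stem = Path(path).stem  # drop the directory and the .mp4 suffix
    person_id, video_id, start, end = stem.split("#")
    return ClipName(person_id, video_id, int(start), int(end))


clip = parse_clip_name(
    "datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4"
)
```

Because the file name contains `#`, quoting the path on the command line is a cautious habit in case the shell treats that character specially.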

Motion transfer:

python evaluate.py --ckp=99 --source=test.png --driving=datasets/vox/test/id10280#NXjT3732Ekg#001093#001192.mp4

Example after training for 7 days on four 2080 Ti GPUs:

Face frontalization:

python evaluate.py --ckp=99 --source=f --driving=datasets/vox/train/id10192#S5yV10aCP7A#003200#003334.mp4
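The three evaluate commands differ only in the value passed to --source. A minimal sketch of that dispatch, assuming "r" selects reconstruction, "f" selects face frontalization, and any other value is a path to a source image for motion transfer, could look like this. `resolve_source` is a hypothetical helper and does not reflect how evaluate.py is actually written.

```python
def resolve_source(source):
    """Map a --source value to (mode, source_image_path).

    The path is None when the source frame is taken from the driving
    video itself rather than from a separate image file.
    """
    if source == "r":
        # Reconstruction: the first frame of the driving video is the source.
        return ("reconstruction", None)
    if source == "f":
        # Frontalization: assumed to also take its source from the driving video.
        return ("frontalization", None)
    # Anything else is treated as the path to a source image.
    return ("motion_transfer", source)


modes = [resolve_source(value) for value in ("r", "test.png", "f")]
```

Keeping the mode and the optional image path in one return value makes it straightforward to branch once and load the source frame in a single place.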

Acknowledgement

Thanks to NV, Imaginaire, AliaksandrSiarohin and DeepHeadPose

Owner

worstcoder
