Source code for "Progressive Transformers for End-to-End Sign Language Production" (ECCV 2020)

Overview

Progressive Transformers for End-to-End Sign Language Production

Source code for "Progressive Transformers for End-to-End Sign Language Production" (Ben Saunders, Necati Cihan Camgoz, Richard Bowden - ECCV 2020)

Conference video available at https://twitter.com/BenMSaunders/status/1336638886198521857

Usage

Install required packages using the requirements.txt file.

pip install -r requirements.txt

To run, start main.py with arguments "train" and ".\Configs\Base.yaml":

python __main__.py train ./Configs/Base.yaml

An example train.log file can be found in ".\Configs\train.log" and a validation file at ".\Configs\validations.txt"

Back Translation model created from https://github.com/neccam/slt. Back Translation evaluation code coming soon.

Data

Pre-processed Phoenix14T data can be requested via email at [email protected]. If you wish to create the data yourself, please follow below:

Phoenix14T data can be downloaded from https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX-2014-T/ and skeleton joints can be extracted using OpenPose at https://github.com/CMU-Perceptual-Computing-Lab/openpose and lifted to 3D using the 2D to 3D Inverse Kinematics code at https://github.com/gopeith/SignLanguageProcessing under 3DposeEstimator.

Prepare Phoenix14T (or other sign language dataset) data as .txt files for .skel, .gloss, .txt and .files. Data format should be parallel .txt files for "src", "trg" and "files", with each line representing a new sequence:

  • The "src" file contains source sentences, with each line representing new sentence.

  • The "trg" file contains skeleton data of each frame, with a space separating frames. The joints should be divided by 3 to match the scaling I used. Each frame contains 150 joint values and a subsequent counter value, all separated by a space. Each sequence should be separated with a new line. If your data contains 150 joints per frame, please ensure that trg_size is set to 150 in the config file.

  • The "files" file should contain the name of each sequence on a new line.

Examples can be found in /Data/tmp. Data path must be specified in config file.

Pre-Trained Model

A pre-trained Progressive Transformer checkpoint can be downloaded from https://www.dropbox.com/s/l4xmnybp7luz0l3/PreTrained_PTSLP_Model.ckpt?dl=0.

This model has a size of num_layers: 2, num_heads: 4 and embedding_dim: 512, as outlined in ./Configs/Base.yaml. It has been pre-trained on the full PHOENIX14T dataset with the data format as above. The relevant train.log and validations.txt files can be found in .\Configs.

To initialise a model from this checkpoint, pass the --ckpt ./PreTrained_PTSLP_Model.ckpt argument to either train or test modes. Additionally, to initialise the correct src_embed size, the config argument src_vocab: "./Configs/src_vocab.txt" must be set to the location of the src_vocab.txt, found under ./Configs. Please open an issue if this checkpoint cannot be downloaded or loaded.

Reference

If you use this code in your research, please cite the following papers:

@inproceedings{saunders2020progressive,
	title		=	{{Progressive Transformers for End-to-End Sign Language Production}},
	author		=	{Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	booktitle   	=   	{Proceedings of the European Conference on Computer Vision (ECCV)},
	year		=	{2020}}

@inproceedings{saunders2020adversarial,
	title		=	{{Adversarial Training for Multi-Channel Sign Language Production}},
	author		=	{Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	booktitle   	=   	{Proceedings of the British Machine Vision Conference (BMVC)},
	year		=	{2020}}

@inproceedings{saunders2021continuous,
	title		=	{{Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks}},
	author		=	{Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
	booktitle   	=   	{International Journal of Computer Vision (IJCV)},
	year		=	{2021}}

Acknowledgements

This work received funding from the SNSF Sinergia project 'SMILE' (CRSII2 160811), the European Union's Horizon2020 research and innovation programme under grant agreement no. 762021 'Content4All' and the EPSRC project 'ExTOL' (EP/R03298X/1). This work reflects only the authors view and the Commission is not responsible for any use that may be made of the information it contains. We would also like to thank NVIDIA Corporation for their GPU grant.

Owner
PhD Student at University of Surrey Researching Sign Language Production with Computer Vision & Natural Language Processing
Full Stack Deep Learning Labs

Full Stack Deep Learning Labs Welcome! Project developed during lab sessions of the Full Stack Deep Learning Bootcamp. We will build a handwriting rec

Full Stack Deep Learning 1.2k Dec 31, 2022
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
PyTorch implementation for "Mining Latent Structures with Contrastive Modality Fusion for Multimedia Recommendation"

MIRCO PyTorch implementation for paper: Latent Structures Mining with Contrastive Modality Fusion for Multimedia Recommendation Dependencies Python 3.

Big Data and Multi-modal Computing Group, CRIPAC 9 Dec 08, 2022
Pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Perspective"

Graph Neural Topic Model (GNTM) This is the pytorch implementation of the paper "Topic Modeling Revisited: A Document Graph-based Neural Network Persp

Dazhong Shen 8 Sep 14, 2022
StarGAN2 for practice

StarGAN2 for practice This version of StarGAN2 (coined as 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scie

vadim epstein 87 Sep 24, 2022
This is a Keras-based Python implementation of DeepMask- a complex deep neural network for learning object segmentation masks

NNProject - DeepMask This is a Keras-based Python implementation of DeepMask- a complex deep neural network for learning object segmentation masks. Th

189 Nov 16, 2022
BMVC 2021 Oral: code for BI-GCN: Boundary-Aware Input-Dependent Graph Convolution for Biomedical Image Segmentation

BMVC 2021 BI-GConv: Boundary-Aware Input-Dependent Graph Convolution for Biomedical Image Segmentation Necassary Dependencies: PyTorch 1.2.0 Python 3.

Yanda Meng 15 Nov 08, 2022
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

A tour through tensorflow with financial data I present several models ranging in complexity from simple regression to LSTM and policy networks. The s

195 Dec 07, 2022
In-place Parallel Super Scalar Samplesort (IPS⁴o)

In-place Parallel Super Scalar Samplesort (IPS⁴o) This is the implementation of the algorithm IPS⁴o presented in the paper Engineering In-place (Share

82 Dec 22, 2022
Pytorch implementation of Zero-DCE++

Zero-DCE++ You can find more details here: https://li-chongyi.github.io/Proj_Zero-DCE++.html. You can find the details of our CVPR version: https://li

Chongyi Li 157 Dec 23, 2022
Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery"

SegSwap Pytorch implementation of paper "Learning Co-segmentation by Segment Swapping for Retrieval and Discovery" [PDF] [Project page] If our project

xshen 41 Dec 10, 2022
Repository of best practices for deep learning in Julia, inspired by fastai

FastAI Docs: Stable | Dev FastAI.jl is inspired by fastai, and is a repository of best practices for deep learning in Julia. Its goal is to easily ena

FluxML 532 Jan 02, 2023
PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Background Activation Suppression for Weakly Supervised Object Localization PyTorch implementation of ''Background Activation Suppression for Weakly S

35 Jan 06, 2023
Unofficial PyTorch Implementation of Multi-Singer

Multi-Singer Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus. Requirements See re

SunMail-hub 123 Dec 28, 2022
TFOD-MASKRCNN - Tensorflow MaskRCNN With Python

Tensorflow- MaskRCNN Steps git clone https://github.com/amalaj7/TFOD-MASKRCNN.gi

Amal Ajay 2 Jan 18, 2022
Implementation of popular bandit algorithms in batch environments.

batch-bandits Implementation of popular bandit algorithms in batch environments. Source code to our paper "The Impact of Batch Learning in Stochastic

Danil Provodin 2 Sep 11, 2022
A PyTorch-based open-source framework that provides methods for improving the weakly annotated data and allows researchers to efficiently develop and compare their own methods.

Knodle (Knowledge-supervised Deep Learning Framework) - a new framework for weak supervision with neural networks. It provides a modularization for se

93 Nov 06, 2022
Source code for GNN-LSPE (Graph Neural Networks with Learnable Structural and Positional Representations)

Graph Neural Networks with Learnable Structural and Positional Representations Source code for the paper "Graph Neural Networks with Learnable Structu

Vijay Prakash Dwivedi 180 Dec 22, 2022
nanodet_plus,yolov5_v6.0

OAK_Detection OAK设备上适配nanodet_plus,yolov5_v6.0 Environment pytorch = 1.7.0

炼丹去了 1 Feb 18, 2022
AdamW optimizer for bfloat16 models in pytorch.

Image source AdamW optimizer for bfloat16 models in pytorch. Bfloat16 is currently an optimal tradeoff between range and relative error for deep netwo

Alex Rogozhnikov 8 Nov 20, 2022