SLAMP: Stochastic Latent Appearance and Motion Prediction

Overview

Official implementation of the paper SLAMP: Stochastic Latent Appearance and Motion Prediction (Adil Kaan Akan, Erkut Erdem, Aykut Erdem, Fatma Guney), accepted and presented at ICCV 2021.

Article

Preprint

Project Website

Pretrained Models

Requirements

All models were trained with Python 3.7.6 and PyTorch 1.4.0 using CUDA 10.1.

A list of required Python packages is available in the requirements.txt file.

Datasets

To prepare the datasets, we followed SRVP's code. Please follow the links below if you want to construct the datasets yourself.

Stochastic Moving MNIST

KTH

BAIR

KITTI

For KITTI, you need to download the Raw KITTI dataset and extract the zip files. You can follow the official KITTI page.

We recommend preprocessing every image in the dataset so that all of them have a size of (w=310, h=92). You can then disable the resizing operation in the data loaders, which speeds up training.

Cityscapes

For Cityscapes, you need to download leftImg8bit_sequence from the official Cityscapes page.

leftImg8bit_sequence contains 30-frame snippets (17Hz) surrounding each left 8-bit image (-19 | +10) from the train, val, and test sets (150000 images).
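The numbers in the snippet description above can be checked with a little arithmetic. The following is a minimal sketch; the variable names are illustrative and do not come from the repository.

```python
# Values quoted from the Cityscapes description above.
FRAMES_BEFORE = 19      # frames preceding the annotated image (-19)
FRAMES_AFTER = 10       # frames following the annotated image (+10)
FRAME_RATE_HZ = 17      # frame rate of each snippet
TOTAL_IMAGES = 150000   # images across the train, val, and test sets

# Each snippet holds the annotated frame plus its neighbours.
snippet_length = FRAMES_BEFORE + 1 + FRAMES_AFTER

# Zero-based position of the annotated frame inside a snippet.
annotated_index = FRAMES_BEFORE

# Duration of one snippet in seconds and the number of snippets overall.
duration_s = snippet_length / FRAME_RATE_HZ
num_snippets = TOTAL_IMAGES // snippet_length

print(snippet_length, annotated_index, round(duration_s, 2), num_snippets)
```

Each snippet therefore covers a little under two seconds of video, and the annotated image is the 20th frame of its snippet.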

We recommend preprocessing every image in the dataset so that all of them have a size of (w=256, h=128). You can then disable the resizing operation in the data loaders, which speeds up training.
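One possible way to do this offline resizing for KITTI or Cityscapes is sketched below. It is not part of the repository: the function names, the use of Pillow, the bilinear interpolation, and the list of image suffixes are all assumptions, and the interpolation used by the repository's data loaders may differ.

```python
from pathlib import Path

# Target sizes (width, height) quoted in this README.
TARGET_SIZES = {"kitti": (310, 92), "cityscapes": (256, 128)}
# Assumed image file suffixes; adjust to match the extracted dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def mirrored_path(src, src_root, dst_root):
    """Map a source image path to the same relative location under dst_root."""
    return Path(dst_root) / Path(src).relative_to(src_root)


def resize_dataset(src_root, dst_root, dataset):
    """Write a resized copy of every image under src_root into dst_root."""
    from PIL import Image  # Pillow is an assumed extra dependency

    src_root, dst_root = Path(src_root), Path(dst_root)
    size = TARGET_SIZES[dataset]  # Pillow expects (width, height)
    count = 0
    # sorted() collects all paths before any output file is written.
    for src in sorted(src_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        dst = mirrored_path(src, src_root, dst_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        with Image.open(src) as img:
            img.resize(size, Image.BILINEAR).save(dst)
        count += 1
    return count
```

Calling `resize_dataset(raw_dir, resized_dir, "cityscapes")` would write a resized copy of the dataset that keeps the original directory layout, so `--data_root` can point to the resized copy afterwards.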

Training

To train a new model, run the script train.py with the dataset-specific options listed below.

The data directory ($DATA_DIR) and the save directory ($SAVE_DIR) must be given using the options --data_root $DATA_DIR --log_dir $SAVE_DIR. To train on a GPU, add the --device flag.

  • for Stochastic Moving MNIST:
--n_past 5 --n_future 10 --n_eval 25 --z_dim_app 20 --g_dim_app 128 --z_dim_motion 20
--g_dim_motion 128 --last_frame_skip --running_avg --batch_size 32
  • for KTH:
--dataset kth --n_past 10 --n_future 10 --n_eval 40 --z_dim_app 50 --g_dim_app 128 --z_dim_motion 50 --model vgg
--g_dim_motion 128 --last_frame_skip --running_avg --sch_sampling 25 --batch_size 20
  • for BAIR:
--dataset bair --n_past 2 --n_future 10 --n_eval 30 --z_dim_app 64 --g_dim_app 128 --z_dim_motion 64 --model vgg
--g_dim_motion 128 --last_frame_skip --running_avg --sch_sampling 25 --batch_size 20 --channels 3
  • for KITTI:
--dataset bair --n_past 10 --n_future 10 --n_eval 30 --z_dim_app 32 --g_dim_app 64 --z_dim_motion 32 --batch_size 8
--g_dim_motion 64 --last_frame_skip --running_avg --model vgg --niter 151 --channels 3
  • for Cityscapes:
--dataset bair --n_past 10 --n_future 10 --n_eval 30 --z_dim_app 32 --g_dim_app 64 --z_dim_motion 32 --batch_size 7
--g_dim_motion 64 --last_frame_skip --running_avg --model vgg --niter 151 --channels 3 --epoch_size 1300
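The option lists above can be assembled into a full command line programmatically. The sketch below does this for the KTH options, copied verbatim from the list; the helper `build_train_command` and its parameters are hypothetical and not part of the repository. The GPU option is left to `extra_args` because this README does not state whether --device takes a value.

```python
import shlex

# KTH options copied verbatim from the list above.
KTH_OPTIONS = (
    "--dataset kth --n_past 10 --n_future 10 --n_eval 40 --z_dim_app 50 "
    "--g_dim_app 128 --z_dim_motion 50 --model vgg --g_dim_motion 128 "
    "--last_frame_skip --running_avg --sch_sampling 25 --batch_size 20"
)


def build_train_command(data_dir, save_dir, options, extra_args=()):
    """Return the argument list for a train.py run (hypothetical helper)."""
    cmd = ["python", "train.py",
           "--data_root", str(data_dir),
           "--log_dir", str(save_dir)]
    cmd += shlex.split(options)  # split the option string like a shell would
    cmd += list(extra_args)      # e.g. the --device option described above
    return cmd


cmd = build_train_command("/path/to/kth", "/path/to/logs", KTH_OPTIONS)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.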

Testing

To evaluate a trained model, the script evaluate.py should be used as follows:

python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH

where $DATADIR is the directory containing the test set, $LOG_DIR is the directory where the results will be saved, and $MODEL_PATH is the path to the trained model file.

Important note: The directory containing the script should include a directory called lpips_weights which contains v0.1 LPIPS weights (from the official repository of The Unreasonable Effectiveness of Deep Features as a Perceptual Metric).
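A quick check before launching an evaluation can catch a missing weights directory early. This is a minimal sketch with a hypothetical helper name; it only verifies that the lpips_weights directory exists and is not empty, since the expected file names inside it are not listed here.

```python
from pathlib import Path


def lpips_weights_ready(script_dir):
    """Return True if script_dir contains a non-empty lpips_weights directory."""
    weights_dir = Path(script_dir) / "lpips_weights"
    return weights_dir.is_dir() and any(weights_dir.iterdir())


if not lpips_weights_ready("."):
    print("lpips_weights is missing or empty; download the v0.1 LPIPS weights first.")
```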

To run the evaluation on GPU, use the option --device.

Pretrained weights are hosted on Dropbox:

  • For MNIST:
wget https://www.dropbox.com/s/eseisehe2u0epiy/slamp_mnist.pth
  • For KTH:
wget https://www.dropbox.com/s/7m0806nt7xt9bz8/slamp_kth.pth
  • For BAIR:
wget https://www.dropbox.com/s/cl1pzs5trw3ltr0/slamp_bair.pth
  • For KITTI:
wget https://www.dropbox.com/s/p7wdboswakyj7yi/slamp_kitti.pth
  • For Cityscapes:
wget https://www.dropbox.com/s/lzwiivr1irffhsj/slamp_cityscapes.pth

PSNR, SSIM, and LPIPS results reported in the paper were obtained with the following options:

  • for Stochastic Moving MNIST:
python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH --n_past 5 --n_future 20
  • for KTH:
python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH --n_past 10 --n_future 30
  • for BAIR:
python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH --n_past 2 --n_future 28
  • for KITTI:
python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH --n_past 10 --n_future 20
  • for Cityscapes:
python evaluate.py --data_root $DATADIR --log_dir $LOG_DIR --model_path $MODEL_PATH --n_past 10 --n_future 20
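In the numbers given in this README, the evaluation sequence length (n_past + n_future) equals the --n_eval value from the training options for every dataset. The sketch below checks this. The dictionary keys are illustrative labels rather than values of --dataset, and the match is an observation from these numbers, not a documented guarantee.

```python
# (n_past, n_future) used for the evaluation commands above.
EVAL_FRAMES = {
    "smmnist": (5, 20),
    "kth": (10, 30),
    "bair": (2, 28),
    "kitti": (10, 20),
    "cityscapes": (10, 20),
}

# --n_eval values from the training options in this README.
TRAIN_N_EVAL = {"smmnist": 25, "kth": 40, "bair": 30, "kitti": 30, "cityscapes": 30}


def total_frames(dataset):
    """Number of frames per evaluated sequence: conditioning plus predicted."""
    n_past, n_future = EVAL_FRAMES[dataset]
    return n_past + n_future


for name in EVAL_FRAMES:
    # Prints the label, the sequence length, and whether it matches --n_eval.
    print(name, total_frames(name), total_frames(name) == TRAIN_N_EVAL[name])
```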

To calculate FVD results, you can use the calculate_fvd.py script as follows:

python calculate_fvd.py $LOG_DIR $SAMPLE_NAME

where $LOG_DIR is the directory containing the results generated by the evaluate script and $SAMPLE_NAME is the file that contains the samples, such as psnr.npz, ssim.npz, or lpips.npz. The script prints the FVD value at the end.

How to Cite

Please cite the paper if you find our paper or this repository useful:

@InProceedings{Akan2021ICCV,
    author    = {Akan, Adil Kaan and Erdem, Erkut and Erdem, Aykut and Guney, Fatma},
    title     = {SLAMP: Stochastic Latent Appearance and Motion Prediction},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {14728-14737}
}

Acknowledgments

We would like to thank the SRVP and SVG authors for making their repositories public. This repository contains several code segments from SRVP's repository and SVG's repository. We appreciate the efforts of Berkay Ugur Senocak in cleaning the code before release.
