MoCoGAN: Decomposing Motion and Content for Video Generation

Last update: Dec 18, 2022


This repository contains an implementation and further details of MoCoGAN: Decomposing Motion and Content for Video Generation by Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz.

CVPR Poster:

Representation

MoCoGAN is a generative model that produces videos from random inputs. It learns separate representations of motion and content, which gives control over what is generated. For example, MoCoGAN can generate the same object performing different actions, as well as the same action performed by different objects.
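The separation can be illustrated with a minimal sketch of how a latent sequence might be assembled: one content vector is sampled once per video and repeated for every frame, while a motion vector is produced per frame by a recurrent update. The paper uses a learned recurrent network for the motion part; the plain tanh recurrence below, along with all function names and dimensions, is a simplified stand-in chosen for illustration and is not the repository's code.

```python
import math
import random


def sample_content(dim_c, rng):
    """Sample one content code; it is shared by every frame of a video."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim_c)]


def sample_motion(num_frames, dim_m, rng, decay=0.5):
    """Produce one motion code per frame with a toy tanh recurrence.

    Assumption: this stands in for a learned recurrent network.
    """
    h = [0.0] * dim_m
    codes = []
    for _ in range(num_frames):
        # Fresh noise is drawn for every frame.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim_m)]
        h = [math.tanh(decay * h_i + e_i) for h_i, e_i in zip(h, eps)]
        codes.append(h)
    return codes


def build_latents(content, motion):
    """Concatenate the fixed content code with each per-frame motion code."""
    return [content + m for m in motion]


rng = random.Random(0)
z_c = sample_content(dim_c=4, rng=rng)

# Same content code, two independently sampled motion trajectories.
video_a = build_latents(z_c, sample_motion(num_frames=8, dim_m=2, rng=rng))
video_b = build_latents(z_c, sample_motion(num_frames=8, dim_m=2, rng=rng))
```

Here `video_a` and `video_b` share the first four entries of every frame (the content part) and differ only in the last two (the motion part), which mirrors the "same object, different actions" case.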

Examples of generated videos

We trained MoCoGAN on the MUG Facial Expression Database to generate facial expressions. When the content code is fixed and the motion code is changed, it generates the same person performing different expressions. When the motion code is fixed and the content code is changed, it generates different people performing the same expression. In the figure shown below, each column has a fixed identity and each row shows the same action:

We trained MoCoGAN on a human action dataset in which content is represented by the performer, who executes several actions. When the content code is fixed and the motion code is changed, it generates the same person performing different actions. When the motion code is fixed and the content code is changed, it generates different people performing the same action. Each pair of images represents the same action executed by different people:
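Both experiments above hold one code fixed while varying the other, which amounts to pairing every content code with every motion code. A minimal sketch of that pairing, using a hypothetical `make_grid` helper and placeholder string labels in place of sampled latent codes:

```python
def make_grid(content_codes, motion_codes):
    """Pair codes so that each column shares a content code (identity)
    and each row shares a motion code (action or expression)."""
    return [[(c, m) for c in content_codes] for m in motion_codes]


# Placeholder labels standing in for sampled latent codes.
identities = ["content_0", "content_1", "content_2"]
actions = ["motion_0", "motion_1"]

grid = make_grid(identities, actions)
for row in grid:
    print(row)
```

In an actual pipeline, each `(content, motion)` pair would be passed to the trained generator to render one video cell of the grid.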

We have collected a large-scale TaiChi dataset including 4.5K videos of TaiChi performers. Below are videos generated by MoCoGAN.

Training MoCoGAN

Please refer to the wiki page.

Citation

If you use MoCoGAN in your research, please cite our paper:

Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz, "MoCoGAN: Decomposing Motion and Content for Video Generation"

@inproceedings{Tulyakov:2018:MoCoGAN,
 title={{MoCoGAN}: Decomposing motion and content for video generation},
 author={Tulyakov, Sergey and Liu, Ming-Yu and Yang, Xiaodong and Kautz, Jan},
 booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
 pages = {1526--1535},
 year={2018}
}

