ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Last update: Nov 25, 2022

ViSER

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive https://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release; make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir -p log tmp

Download the pre-processed data (RGB, mask, flow) from the link here and unzip it under ./database/DAVIS/. The dataset is organized as follows:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally sequence-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg
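
Before launching an optimization run, it can help to confirm that a sequence has all four per-sequence folders from the listing above. The following is a minimal sketch, not part of the ViSER codebase: `check_davis_layout` is a hypothetical helper, and the folder names are copied verbatim from the listing, so adjust the capitalization if the unzipped archive differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with ViSER): report which of the expected
# per-sequence folders are missing under a DAVIS-style root directory.
# Usage: check_davis_layout <root> <sequence-name>
check_davis_layout() {
    local root="$1" seq="$2" missing=0 sub
    # Folder names follow the layout listing; they are assumed, not verified.
    for sub in Annotations JPEGImages FlowBW FlowFw; do
        if [ ! -d "$root/$sub/Full-Resolution/$seq" ]; then
            echo "missing: $root/$sub/Full-Resolution/$seq"
            missing=1
        fi
    done
    # Exit status 0 if every folder exists, 1 otherwise.
    return "$missing"
}

# Example call:
# check_davis_layout ./database/DAVIS breakdance-flare
```

The function only checks that the directories exist; it does not validate the frame numbering or the contents of the .pfm files.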

To run preprocessing scripts on other videos, see install.md.

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render the optimized breakdance-flare result, run

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36
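
The render command above takes an explicit checkpoint path. If a log folder holds several checkpoints, the one with the highest number can be selected with a small helper. This is a sketch under the assumption that checkpoints are named pred_net_<number>.pth, as in the path above; `latest_checkpoint` is a hypothetical function and not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with ViSER): print the checkpoint with the
# highest numeric suffix in a log folder, assuming names like pred_net_20.pth.
# Usage: latest_checkpoint <log-folder>
latest_checkpoint() {
    local logdir="$1" best="" best_n=-1 f n
    for f in "$logdir"/pred_net_*.pth; do
        [ -e "$f" ] || continue          # glob matched nothing
        n="${f##*pred_net_}"             # strip everything up to the prefix
        n="${n%.pth}"                    # strip the extension
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip non-numeric suffixes
        esac
        if [ "$n" -gt "$best_n" ]; then
            best_n="$n"
            best="$f"
        fi
    done
    # Print the result; return non-zero if no numbered checkpoint was found.
    [ -n "$best" ] && echo "$best"
}

# Example call:
# bash scripts/render_result.sh breakdance-flare "$(latest_checkpoint log/breakdance-flare-1003-ft2)" 36
```

The highest-numbered checkpoint is not necessarily the best one, so the explicit path remains the safer choice when reproducing a specific result.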

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render the optimized elephants result, run

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node multi-GPU training with PyTorch DistributedDataParallel. To select devices, modify dev and ngpu in scripts/template.sh.
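
When editing the two variables, it is easy to let them drift apart. The sketch below keeps them consistent by deriving ngpu from dev. It assumes that dev is a comma-separated list of GPU indices (in the style of CUDA_VISIBLE_DEVICES) and that ngpu is the number of processes to launch; the actual contents of scripts/template.sh are not shown here, and `count_devices` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the entries in a comma-separated device list.
count_devices() {
    local IFS=','
    # Split the argument on commas into this function's positional parameters.
    # shellcheck disable=SC2086
    set -- $1
    echo "$#"
}

# Assumed meaning of the variables: dev lists GPU indices, ngpu is their count.
dev=0,1
ngpu=$(count_devices "$dev")
echo "dev=$dev ngpu=$ngpu"   # prints "dev=0,1 ngpu=2"
```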

Potential bugs

When batch_size is set to 3, the rendered flow may collapse to constant values.

Acknowledgement

The code borrows the skeleton of CMR

External repos:

Citation

To cite our paper

@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}

@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}

TODO

data pre-processing scripts
evaluation data and scripts
code clean up

Owner

Gengshan Yang
