ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Overview

ViSER

Installation with conda

conda env create -f viser.yml
conda activate viser-release
# install softras
cd third_party/softras; python setup.py install; cd -;
# install manifold remeshing
git clone --recursive git://github.com/hjwdzh/Manifold; cd Manifold; mkdir build; cd build; cmake .. -DCMAKE_BUILD_TYPE=Release;make -j8; cd ../../

Data preparation

Create folders to store intermediate data and training logs

mkdir log; mkdir tmp; 

Download pre-processed data (rgb, mask, flow) following the link here and unzip under ./database/DAVIS/. The dataset is organized as:

DAVIS/
    Annotations/
        Full-Resolution/
            sequence-name/
                {%05d}.png
    JPEGImages/
        Full-Resolution/
            sequence-name/
                {%05d}.jpg
    FlowBW/ and FlowFw/
        Full-Resolution/
            sequence-name/ and optionally seqname-name_{%02d}/ (frame interval)
                flo-{%05d}.pfm
                occ-{%05d}.pfm
                visflo-{%05d}.jpg
                warp-{%05d}.jpg

To run preprocessing scripts on other videos, see install.md.

Example: breakdance-flare

Run

bash scripts/template.sh breakdance-flare

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_result.sh breakdance-flare log/breakdance-flare-1003-ft2/pred_net_20.pth 36

Example outputs:

Example: elephants

Run

bash scripts/relephant-walk.sh

To monitor optimization, run

tensorboard --logdir log/

To render optimized breakdance-flare

bash scripts/render_elephants.sh log/elephant-walk-1003-6/pred_net_10.pth

Additional Notes

Distributed training

The current codebase supports single-node multi-gpu training with pytorch distributed data-parallel. Please modify dev and ngpu in scripts/template.sh to select devices.

Potential bugs
  • When setting batch_size to 3, rendered flow may become constant values.

Acknowledgement

The code borrows the skeleton of CMR

External repos:

Citation

To cite our paper
@inproceedings{yang2021viser,
  title={ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Liu, Ce
      and Ramanan, Deva},
  booktitle = {NeurIPS},
  year={2021}
}  
@inproceedings{yang2021lasr,
  title={LASR: Learning Articulated Shape Reconstruction from a Monocular Video},
  author={Yang, Gengshan 
      and Sun, Deqing
      and Jampani, Varun
      and Vlasic, Daniel
      and Cole, Forrester
      and Chang, Huiwen
      and Ramanan, Deva
      and Freeman, William T
      and Liu, Ce},
  booktitle={CVPR},
  year={2021}
}  

TODO

  • data pre-processing scripts
  • evaluation data and scripts
  • code clean up
[ICCV21] Self-Calibrating Neural Radiance Fields

Self-Calibrating Neural Radiance Fields, ICCV, 2021 Project Page | Paper | Video Author Information Yoonwoo Jeong [Google Scholar] Seokjun Ahn [Google

381 Dec 30, 2022
Tensorflow Tutorials using Jupyter Notebook

Tensorflow Tutorials using Jupyter Notebook TensorFlow tutorials written in Python (of course) with Jupyter Notebook. Tried to explain as kindly as po

Sungjoon 2.6k Dec 22, 2022
AdelaiDepth is an open source toolbox for monocular depth prediction.

AdelaiDepth is an open source toolbox for monocular depth prediction.

Adelaide Intelligent Machines (AIM) Group 743 Jan 01, 2023
Mail classification with tensorflow and MS Exchange Server (ham or spam).

Mail classification with tensorflow and MS Exchange Server (ham or spam).

Metin Karatas 1 Sep 11, 2021
Voxel Transformer for 3D object detection

Voxel Transformer This is a reproduced repo of Voxel Transformer for 3D object detection. The code is mainly based on OpenPCDet. Introduction We provi

173 Dec 25, 2022
Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

M-LSD: Towards Light-weight and Real-time Line Segment Detection Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Det

123 Jan 04, 2023
Mengzi Pretrained Models

中文 | English Mengzi 尽管预训练语言模型在 NLP 的各个领域里得到了广泛的应用,但是其高昂的时间和算力成本依然是一个亟需解决的问题。这要求我们在一定的算力约束下,研发出各项指标更优的模型。 我们的目标不是追求更大的模型规模,而是轻量级但更强大,同时对部署和工业落地更友好的模型。

Langboat 424 Jan 04, 2023
This repo contains the code required to train the multivariate time-series Transformer.

Multi-Variate Time-Series Transformer This repo contains the code required to train the multivariate time-series Transformer. Download the data The No

Gregory Duthé 4 Nov 24, 2022
Google AI Open Images - Object Detection Track: Open Solution

Google AI Open Images - Object Detection Track: Open Solution This is an open solution to the Google AI Open Images - Object Detection Track 😃 More c

minerva.ml 46 Jun 22, 2022
Contextual Attention Localization for Offline Handwritten Text Recognition

CALText This repository contains the source code for CALText model introduced in "CALText: Contextual Attention Localization for Offline Handwritten T

0 Feb 17, 2022
FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes This repository contains the source code accompanying the paper: FlexConv: C

Robert-Jan Bruintjes 96 Dec 12, 2022
Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation"

Keyword2Text This repository contains the code of the paper: "A Plug-and-Play Method for Controlled Text Generation", if you find this useful and use

57 Dec 27, 2022
A Closer Look at Structured Pruning for Neural Network Compression

A Closer Look at Structured Pruning for Neural Network Compression Code used to reproduce experiments in https://arxiv.org/abs/1810.04622. To prune, w

Bayesian and Neural Systems Group 140 Dec 05, 2022
WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

WarpDrive is a flexible, lightweight, and easy-to-use open-source reinforcement learning (RL) framework that implements end-to-end multi-agent RL on a single GPU (Graphics Processing Unit).

Salesforce 334 Jan 06, 2023
HyperCube: Implicit Field Representations of Voxelized 3D Models

HyperCube: Implicit Field Representations of Voxelized 3D Models Authors: Magdalena Proszewska, Marcin Mazur, Tomasz Trzcinski, Przemysław Spurek [Pap

Magdalena Proszewska 3 Mar 09, 2022
This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation

This repository contains the database and code used in the paper Embedding Arithmetic for Text-driven Image Transformation (Guillaume Couairon, Holger

Meta Research 31 Oct 17, 2022
MogFace: Towards a Deeper Appreciation on Face Detection

MogFace: Towards a Deeper Appreciation on Face Detection Introduction In this repo, we propose a promising face detector, termed as MogFace. Our MogFa

48 Dec 20, 2022
(CVPR 2022 Oral) Official implementation for "Surface Representation for Point Clouds"

RepSurf - Surface Representation for Point Clouds [CVPR 2022 Oral] By Haoxi Ran* , Jun Liu, Chengjie Wang ( * : corresponding contact) The pytorch off

Haoxi Ran 264 Dec 23, 2022
This repository is an implementation of paper : Improving the Training of Graph Neural Networks with Consistency Regularization

CRGNN Paper : Improving the Training of Graph Neural Networks with Consistency Regularization Environments Implementing environment: GeForce RTX™ 3090

THUDM 28 Dec 09, 2022
Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec

Personal thermal comfort models using digital twins: Preference prediction with BIM-extracted spatial-temporal proximity data from Build2Vec This repo

Building and Urban Data Science (BUDS) Group 5 Dec 02, 2022