NARF: Neural Articulated Radiance Field
Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada
ICCV 2021

[Paper] [Code]

Abstract

We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images. While recent advances in 3D implicit representation have made it possible to learn models of complex objects, learning pose-controllable representations of articulated objects remains a challenge, as current methods require 3D shape supervision and are unable to render appearance. In formulating an implicit representation of 3D articulated objects, our method considers only the rigid transformation of the most relevant object part in solving for the radiance field at each 3D location. In this way, the proposed method represents pose-dependent changes without significantly increasing the computational complexity. NARF is fully differentiable and can be trained from images with pose annotations. Moreover, through the use of an autoencoder, it can learn appearance variations over multiple instances of an object class. Experiments show that the proposed method is efficient and can generalize well to novel poses.

Method

We extend Neural Radiance Fields (NeRF) to articulated objects. NARF is a NeRF conditioned on skeletal parameters and pose: an MLP that takes a 3D position and a 2D viewing direction as input and outputs the density and color at that point. Since an articulated object can be regarded as multiple rigid parts connected by joints, we make the following two assumptions:

  • The density of each part does not change in the coordinate system fixed to the part.
  • A point on the surface of the object belongs to only one of the parts.

Therefore, we transform the input 3D coordinates into the local coordinate system of each part and feed these local coordinates to the model. Following the second assumption, a selector MLP picks only the single relevant part's coordinates and masks out the others.

An overview of the model is shown in the figure below.

[Figure: model overview]

The model is trained with the L2 loss between the generated image and the ground truth image.
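
The core computation can be sketched in PyTorch as follows. This is a minimal illustration of the part-wise formulation under the two assumptions above, not the authors' exact architecture; all module names and layer sizes are assumptions.

    import torch
    import torch.nn as nn

    class TinyNARF(nn.Module):
        """Minimal NARF-style radiance field: world points are mapped into
        each part's local frame, a selector MLP predicts soft part-assignment
        weights, and the masked per-part features are decoded into density
        and color."""

        def __init__(self, num_parts: int, hidden: int = 128):
            super().__init__()
            # Shared encoder applied to each part's local coordinates.
            self.part_encoder = nn.Sequential(
                nn.Linear(3, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
            # Selector: predicts which part each point belongs to.
            self.selector = nn.Sequential(
                nn.Linear(3 * num_parts, hidden), nn.ReLU(),
                nn.Linear(hidden, num_parts))
            self.density_head = nn.Linear(hidden, 1)
            # Color additionally depends on the viewing direction
            # (a unit 3-vector here, encoding the 2D direction).
            self.color_head = nn.Sequential(
                nn.Linear(hidden + 3, hidden), nn.ReLU(), nn.Linear(hidden, 3))

        def forward(self, x, view_dir, R, t):
            # x: (B, 3) world points; view_dir: (B, 3) viewing directions
            # R: (P, 3, 3), t: (P, 3) rigid transform of each part
            # Local coordinates in every part's frame: l_p = R_p^T (x - t_p)
            local = torch.einsum('pji,bpj->bpi', R, x[:, None] - t[None])
            # Soft selection over parts, used to mask irrelevant parts.
            weights = torch.softmax(self.selector(local.flatten(1)), dim=-1)
            feats = self.part_encoder(local)                     # (B, P, H)
            feat = (weights[..., None] * feats).sum(dim=1)       # (B, H)
            sigma = torch.relu(self.density_head(feat))          # density
            rgb = torch.sigmoid(
                self.color_head(torch.cat([feat, view_dir], dim=-1)))
            return sigma, rgb

Training then follows standard NeRF practice: densities and colors sampled along camera rays are volume-rendered into pixels, and the L2 loss above is applied between the rendered and ground-truth pixel values.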

Results

The proposed NARF is capable of rendering images with explicit control of the viewpoint, bone pose, and bone parameters. These representations are disentangled and can be controlled independently.

Viewpoint change (seen in training)

Pose change (unseen in training)

Bone length change (unseen in training)

NARF generalizes well to viewpoints unseen during training.

Furthermore, NARF can render a segmentation mask for each part by visualizing the output values of the selector.
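
For intuition, such a segmentation can be obtained by volume-accumulating the selector's soft part weights along each ray instead of color. The helper below is a hypothetical sketch, not the repository's visualization code.

    import torch

    def render_part_segmentation(sigma, part_weights, deltas):
        # sigma: (R, S, 1) densities at S samples along R rays
        # part_weights: (R, S, P) selector probabilities per part
        # deltas: (R, S, 1) distances between consecutive samples
        alpha = 1.0 - torch.exp(-sigma * deltas)
        # Transmittance via the standard NeRF cumulative product.
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10],
                      dim=1), dim=1)[:, :-1]
        w = alpha * trans                        # per-sample render weights
        seg = (w * part_weights).sum(dim=1)      # (R, P) accumulated weights
        return seg.argmax(dim=-1)                # hard per-ray part labels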

Combined with an autoencoder, NARF can also learn appearance variations over multiple instances of an object class. The video below visualizes the disentangled representations and segmentation masks learned by the NARF autoencoder.
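
Conceptually, the autoencoder variant conditions the radiance field on a per-instance appearance code produced by an image encoder, so that viewpoint, pose, and appearance stay disentangled. The sketch below is only an illustration; the encoder architecture and latent size are assumptions, not the paper's exact design.

    import torch
    import torch.nn as nn

    class AppearanceEncoder(nn.Module):
        # Hypothetical encoder mapping an image to an appearance code z.
        def __init__(self, z_dim: int = 64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, z_dim))

        def forward(self, image):      # image: (B, 3, H, W)
            return self.net(image)     # (B, z_dim)

    # z is concatenated to the radiance field's per-point input so density
    # and color depend on instance appearance as well as pose; training the
    # encoder and NARF jointly to reconstruct the input image makes the
    # whole pipeline an autoencoder.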

Code

Environment

    python 3.7.*
    pytorch >= 1.7.1
    torchvision >= 0.8.2

    pip install tensorboardx pyyaml opencv-python pandas ninja easydict tqdm scipy scikit-image

Dataset preparation

THUman

Please refer to https://github.com/nogu-atsu/NARF/tree/master/data/THUman

Your own dataset

Coming soon.

Training

  • Write a config file like NARF/configs/THUman/results_wxl_20181008_wlz_3_M/NARF_D.yml (see the loading sketch after this list). Do not change default.yml.

    • out_root: root directory to save models
    • out: experiment name
    • data_root: directory containing the dataset
  • Run training, specifying a config file:

    CUDA_VISIBLE_DEVICES=0 python train.py --config NARF/configs/[your_config.yml] --num_workers 1

  • Distributed data parallel

    python train_ddp.py --config NARF/configs/[your_config.yml] --gpus 4 --num_workers 1
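
Since the repository installs pyyaml and easydict, a config with the fields above might be loaded as in this sketch (the actual loading code in train.py may differ):

    import yaml
    from easydict import EasyDict

    # Path follows the example config named above.
    with open('NARF/configs/THUman/results_wxl_20181008_wlz_3_M/NARF_D.yml') as f:
        config = EasyDict(yaml.safe_load(f))

    print(config.out_root, config.out, config.data_root)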

Validation

  • Single GPU

    python train.py --config NARF/configs/[your_config.yml] --num_workers 1 --validation --resume_latest

  • Multiple GPUs

    python train_ddp.py --config NARF/configs/[your_config.yml] --gpus 4 --num_workers 1 --validation --resume_latest

  • The results are saved to val_metrics.json in the same directory as the snapshots.

Computational cost

    python computational_cost.py --config NARF/configs/[your_config.yml]

Visualize results

  • Generate interpolation videos

    cd visualize
    python NARF_interpolation.py --config ../NARF/configs/[your_config.yml]
    

    The results are saved to the same directory as the snapshots. With the default settings, it takes 30 minutes on a V100 GPU to generate a 30-frame video.

Acknowledgement

https://github.com/rosinality/stylegan2-pytorch
https://github.com/ZhengZerong/DeepHuman
https://smpl.is.tue.mpg.de/

BibTeX

@inproceedings{2021narf,
  author    = {Noguchi, Atsuhiro and Sun, Xiao and Lin, Stephen and Harada, Tatsuya},
  title     = {Neural Articulated Radiance Field},
  booktitle = {International Conference on Computer Vision},
  year      = {2021},
}