PyTorch implementation of the ICCV'21 paper "MVTN: Multi-View Transformation Network for 3D Shape Recognition"

Last update: Jan 03, 2023

Overview

MVTN: Multi-View Transformation Network for 3D Shape Recognition (ICCV 2021)

By Abdullah Hamdi, Silvio Giancola, Bernard Ghanem

Paper | Video | Tutorial

The official PyTorch code of the ICCV 2021 paper MVTN: Multi-View Transformation Network for 3D Shape Recognition. MVTN learns to transform the rendering parameters of a 3D object so that the resulting views are better suited for recognition by multi-view networks. Without extra supervision or additional losses, MVTN improves performance in 3D classification and shape retrieval. MVTN achieves state-of-the-art performance on ModelNet40, ShapeNet Core55, and the more recent and realistic ScanObjectNN dataset (up to 6% improvement).

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Hamdi_2021_ICCV,
    author    = {Hamdi, Abdullah and Giancola, Silvio and Ghanem, Bernard},
    title     = {MVTN: Multi-View Transformation Network for 3D Shape Recognition},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {1-11}
}

Requirements

This code is tested with Python 3.7 and PyTorch >= 1.5.

Install PyTorch3D as follows:

conda create -y -n MVTN python=3.7
conda activate MVTN
conda install -c pytorch pytorch=1.7.1 torchvision cudatoolkit=10.2
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler nvidiacub
conda install pytorch3d -c pytorch3d

Install the other helper libraries:

conda install pandas
conda install -c conda-forge trimesh
pip install einops imageio scipy matplotlib tensorboard h5py metric-learn
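
After installation, a quick check that the packages can be found may save a failed run later. The sketch below uses only the standard library. The module names are assumed to be the usual import names of the packages installed above (for example, metric-learn is imported as metric_learn), and missing_modules is a hypothetical helper that is not part of this repo.

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED = [
    "torch", "torchvision", "pytorch3d", "pandas", "trimesh", "einops",
    "imageio", "scipy", "matplotlib", "tensorboard", "h5py", "metric_learn",
]


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means every listed package was found.
print("missing:", missing_modules())
```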

Usage: 3D Classification & Retrieval

The main Python script is run_mvtn.py in the root directory.

First download the datasets and unzip them inside the data/ directory as follows:

ModelNet40: this link (the ModelNet object meshes are simplified to fit in GPU memory and to allow for backpropagation).
ShapeNet Core55 v2: this link (you need to create an account).
ScanObjectNN: this link (ScanObjectNN has three main variants [obj_only, with_bg, hardest], selected with the --dset_variant option).
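
Before training, it can help to confirm that the unzipped folders are where the script will look for them. The sketch below assumes the folder names match the dataset names used by --data_dir ("ModelNet40", "ShapeNetCore.v2", "ScanObjectNN") and that they sit directly under data/. Both points are assumptions about the layout, and missing_datasets is a hypothetical helper.

```python
from pathlib import Path

# Assumed folder names, taken from the dataset names used with --data_dir.
EXPECTED_DATASETS = ("ModelNet40", "ShapeNetCore.v2", "ScanObjectNN")


def missing_datasets(data_root="data"):
    """Return the expected dataset folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]


# Only the datasets you plan to use need to be present.
print("not found:", missing_datasets("data"))
```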

Then you can run MVTN with:

python run_mvtn.py --data_dir data/ModelNet40/ --run_mode train --mvnetwork mvcnn --nb_views 8 --views_config learned_spherical

--data_dir: the data directory. The dataloader is picked from custom_dataset.py depending on which of "ModelNet40", "ShapeNetCore.v2", or "ScanObjectNN" is chosen.
--run_mode: the run mode. Choices: "train" (train for classification), "test_cls" (test classification after training), "test_retr" (test retrieval after training), "test_rot" (test rotation robustness after training), "test_occ" (test occlusion robustness after training).
--mvnetwork: the multi-view network used in the pipeline. Choices: "mvcnn", "rotnet", "viewgcn".
--views_config: one of six view selection methods, which are either learned or heuristic. Choices: "circular", "random", "spherical", "learned_circular", "learned_spherical", "learned_direct". Only the learned ones are MVTN variants.
--resume: a flag to continue training from the last checkpoint.
--pc_rendering: a flag to use point clouds instead of mesh data, and point cloud rendering instead of mesh rendering. This should be set when only point cloud data is available (as in the ScanObjectNN dataset).
--object_color: the uniform color of the rendered mesh or object. Default: "white". Choices: "white", "random", "black", "red", "green", "blue", "custom".
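
To give a rough geometric picture of the two heuristic configurations named above, the following sketch places cameras either on a circle or over a sphere. It is not the repository's code: the function names, the 30-degree elevation, and the Fibonacci-lattice construction are assumptions made for illustration. The learned variants are not modeled here.

```python
import math


def circular_views(nb_views, elevation_deg=30.0):
    """Evenly spaced azimuths on one circle at a fixed (assumed) elevation."""
    step = 360.0 / nb_views
    return [(i * step, elevation_deg) for i in range(nb_views)]


def spherical_views(nb_views):
    """Roughly uniform directions over a sphere using a Fibonacci lattice."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    views = []
    for i in range(nb_views):
        z = 1.0 - (2.0 * i + 1.0) / nb_views  # height, strictly inside (-1, 1)
        azimuth = math.degrees(i * golden_angle) % 360.0
        elevation = math.degrees(math.asin(z))
        views.append((azimuth, elevation))
    return views


# Each entry is an (azimuth, elevation) pair in degrees.
for azimuth, elevation in spherical_views(8):
    print(f"azimuth={azimuth:7.2f}  elevation={elevation:7.2f}")
```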

Other parameters can be found in the config.yaml configuration file or by running python run_mvtn.py -h. The default parameters are the ones used in the paper.

The results will be saved in the results/00/0001/ folder, which contains the camera viewpoints and the renderings of some examples, as well as the checkpoints and the logs.
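
When launching several runs, the documented options can be assembled programmatically. The sketch below only uses option names and choices listed in this README. build_command is a hypothetical helper, and it assumes that --resume and --pc_rendering are passed without a value.

```python
import shlex

RUN_MODES = {"train", "test_cls", "test_retr", "test_rot", "test_occ"}
MVNETWORKS = {"mvcnn", "rotnet", "viewgcn"}
VIEWS_CONFIGS = {
    "circular", "random", "spherical",
    "learned_circular", "learned_spherical", "learned_direct",
}


def build_command(data_dir, run_mode="train", mvnetwork="mvcnn", nb_views=8,
                  views_config="learned_spherical", resume=False,
                  pc_rendering=False):
    """Return the argument list for one run_mvtn.py invocation."""
    if run_mode not in RUN_MODES:
        raise ValueError(f"unknown run_mode: {run_mode}")
    if mvnetwork not in MVNETWORKS:
        raise ValueError(f"unknown mvnetwork: {mvnetwork}")
    if views_config not in VIEWS_CONFIGS:
        raise ValueError(f"unknown views_config: {views_config}")
    cmd = [
        "python", "run_mvtn.py",
        "--data_dir", data_dir,
        "--run_mode", run_mode,
        "--mvnetwork", mvnetwork,
        "--nb_views", str(nb_views),
        "--views_config", views_config,
    ]
    if resume:  # assumed to be a value-less flag
        cmd.append("--resume")
    if pc_rendering:  # assumed to be a value-less flag
        cmd.append("--pc_rendering")
    return cmd


def as_shell(cmd):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in cmd)


print(as_shell(build_command("data/ModelNet40/")))
```

The returned list can also be passed directly to subprocess.run, which avoids shell quoting altogether.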

Note: For best performance on point cloud tasks, please set canonical_distance : 1.0 in the config.yaml file. For mesh tasks, keep as is.
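
If you switch between mesh and point cloud tasks often, the edit above can be scripted. The sketch below works on the file text with the standard library only. It assumes that canonical_distance appears exactly once on its own line in config.yaml in the form "canonical_distance: value", which has not been checked against the actual file. set_canonical_distance is a hypothetical helper, and the sample text is made up.

```python
import re

# Matches "canonical_distance: <value>" on one line, keeping any trailing comment.
_PATTERN = re.compile(
    r"^([ \t]*canonical_distance[ \t]*:[ \t]*)([^#\n]*?)([ \t]*(?:#.*)?)$",
    re.MULTILINE,
)


def set_canonical_distance(text, value=1.0):
    """Return the config text with the canonical_distance value replaced."""
    new_text, count = _PATTERN.subn(
        lambda m: f"{m.group(1)}{value}{m.group(3)}", text
    )
    if count != 1:
        raise ValueError(f"expected one canonical_distance entry, found {count}")
    return new_text


# Made-up sample; the real config.yaml has other keys and values.
sample = "nb_views: 8\ncanonical_distance: 9.9  # hypothetical value\n"
print(set_canonical_distance(sample))
```

To apply it to the real file, read config.yaml into a string, pass it through the helper, and write the result back after keeping a backup.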

Other files

models/renderer.py contains the main PyTorch3D differentiable renderer class, which can render multi-view images for both point clouds and meshes.
models/mvtn.py contains a standalone class for MVTN that can be used with any other pipeline.
custom_dataset.py includes all the PyTorch dataloaders for the 3D datasets: ModelNet40, ShapeNet Core55, ScanObjectNN, and ShapeNet Parts.
blender_simplify.py is the Blender code used to simplify the meshes with the simplify_mesh function from util.py, as follows:

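import os
from util import simplify_mesh  # helper defined in util.py
data_dir = "data"  # assumed dataset root; adjust to your setup
simplify_ratio = 0.05  # the ratio of faces to be maintained after simplification
input_mesh_file = os.path.join(data_dir, "ModelNet40/plant/train/plant_0014.off")
mymesh, reduced_mesh = simplify_mesh(input_mesh_file, simplify_ratio=simplify_ratio)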

The simplified output mesh will be saved in the same directory as the original mesh, with "SMPLER" appended to the name.

Misc

Please open an issue or contact Abdullah Hamdi if you have any questions.

Acknowledgements

This paper and repo borrow code and ideas from several great GitHub repos: MVCNN pytorch, view GCN, RotationNet, and most importantly the great PyTorch3D library.

License

The code is released under the MIT License (see the LICENSE file for details).
