MonoScene: Monocular 3D Semantic Scene Completion

Overview

[arXiv + supp] | [Project page]
Anh-Quan Cao, Raoul de Charette
Inria, Paris, France

If you find this work useful, please cite our paper:

@misc{cao2021monoscene,
      title={MonoScene: Monocular 3D Semantic Scene Completion}, 
      author={Anh-Quan Cao and Raoul de Charette},
      year={2021},
      eprint={2112.00726},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Code and models will be released soon. Please watch this repo for updates.

Demo

SemanticKITTI | KITTI-360 (trained on SemanticKITTI)

NYUv2

Comments
  • TypeError: 'int' object is not subscriptable

    (monoscene) ~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=2 batch_size=2
    exp_kitti_1_FrusSize_8_nRelations4_WD0.0001_lr0.0001_CEssc_geoScalLoss_semScalLoss_fpLoss_CERel_3DCRP_Proj_2_4_8
    n_relations (32, 32, 4)
    Traceback (most recent call last):
      File "monoscene/scripts/train_monoscene.py", line 118, in main
        class_weights=class_weights,
      File "/home/ruidong/workplace/MonoScene/monoscene/models/monoscene.py", line 80, in __init__
        context_prior=context_prior,
      File "/home/ruidong/workplace/MonoScene/monoscene/models/unet3d_kitti.py", line 62, in __init__
        self.feature * 4, self.feature * 4, size_l3, bn_momentum=bn_momentum
      File "/home/ruidong/workplace/MonoScene/monoscene/models/CRP3D.py", line 15, in __init__
        self.flatten_size = size[0] * size[1] * size[2]
    TypeError: 'int' object is not subscriptable

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 21
  • Questions about cross-entropy loss

    Dear authors, thanks for your great work! In your paper, you say that "the losses are computed only where y is defined". Does this mean that you add no supervision on non-occupied voxels and apply the multi-class classification loss only to occupied voxels? If so, how can the model identify which voxels are occupied?

    opened by weiyithu 13
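
One common reading of "the losses are computed only where y is defined" is that voxels with no ground-truth label at all are skipped, while free-space voxels keep an explicit "empty" class and are still supervised. The dependency-free sketch below illustrates that reading only. The ignore value 255, the class index 0 for empty space, and the function name `masked_cross_entropy` are assumptions made for illustration, not taken from the MonoScene code.

```python
import math

IGNORE = 255  # assumed marker for voxels with no ground-truth label
EMPTY = 0     # assumed class index for free (unoccupied) space


def masked_cross_entropy(logits, labels, ignore=IGNORE):
    """Mean cross-entropy over voxels whose label is defined.

    logits: list of per-voxel score lists (one score per class)
    labels: list of per-voxel class indices, or `ignore` where undefined
    """
    total, count = 0.0, 0
    for scores, y in zip(logits, labels):
        if y == ignore:
            continue  # undefined voxel: contributes no loss
        # numerically stable log-sum-exp over the class scores
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[y]
        count += 1
    return total / count if count else 0.0


# Three voxels, three classes (0 = empty, 1 and 2 = semantic classes).
logits = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [5.0, 5.0, 5.0]]
labels = [EMPTY, 1, IGNORE]
loss = masked_cross_entropy(logits, labels)
```

Under this reading, occupancy is still learned because "empty" is an ordinary class in the loss; only voxels without any label are excluded from it.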
  • about test

    FileNotFoundError: [Errno 2] No such file or directory: '/home/ruidong/workplace/MonoScene/trained_models/monoscene_kitti.ckpt'

    The last line printed during training was: Epoch 29: 100%|████████| 2325/2325 [1:06:52<00:00, 1.73s/it, loss=3.89, v_num=]

    opened by DipDipPotatoChips 13
  • Cuda out of memory

    Dear author, you suggested: "Use a smaller 2D backbone by changing the basemodel_name and num_features. The pretrained model name is here. You can try EfficientNet-B5 to reduce memory." Which B5 weights should I use, and what value should num_features take?

    opened by lulianLiu 12
  • Pretrained models on other dataset: NuScenes

    Hi @anhquancao,

    Thanks so much for your paper and your implementation. Do you have a model pretrained on NuScenes? If so, could you share it? I want to build on your work with the NuScenes dataset, but there is a large domain gap between SemanticKITTI and NuScenes, so the model pretrained on SemanticKITTI does not work well on NuScenes.

    Thanks!

    opened by ducminhkhoi 11
  • failed to run test

    When I try to run this script, it crashes without giving any information:

    python monoscene/scripts/generate_output.py +output_path=$MONOSCENE_OUTPUT dataset=kitti_360 +kitti_360_root=$KITTI_360_ROOT +kitti_360_sequence=2013_05_28_drive_0028_sync n_gpus=1 batch_size=1

    Any suggestion will be much appreciated.

    opened by ChiyuanFeng 9
  • cannot find calib

    PS F:\Studying\CY-Workspace\MonoScene-master> python monoscene/scripts/eval_monoscene.py dataset=kitti kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS n_gpus=1 batch_size= 1
    GPU available: True, used: True
    TPU available: False, using: 0 TPU cores
    IPU available: False, using: 0 IPUs
    n_relations 4
    Using cache found in C:\Users\DELL/.cache\torch\hub\rwightman_gen-efficientnet-pytorch_master
    Loading base model ()...Done.
    Removing last two layers (global_pool & classifier).
    Building Encoder-Decoder model..Done.
    Traceback (most recent call last):
      File "monoscene/scripts/eval_monoscene.py", line 71, in main
        data_module.setup()
      File "F:\anaconda\envs\monoscene\lib\site-packages\pytorch_lightning\core\datamodule.py", line 440, in wrapped_fn
        fn(*args, **kwargs)
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dm.py", line 34, in setup
        color_jitter=(0.4, 0.4, 0.4),
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 60, in __init__
        os.path.join(self.root, "dataset", "sequences", sequence, "calib.txt")
      File "F:\Studying\CY-Workspace\MonoScene-master\monoscene\scripts/../..\monoscene\data\semantic_kitti\kitti_dataset.py", line 193, in read_calib
        with open(calib_path, "r") as f:
    FileNotFoundError: [Errno 2] No such file or directory: 'dataset\sequences\00\calib.txt'

    opened by cyaccpect 9
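
The path in the error, `dataset\sequences\00\calib.txt`, is relative, which suggests that the dataset root reached the script as an empty string. This can happen when `$KITTI_ROOT` is not defined in the shell that launches the command; PowerShell, for example, reads environment variables as `$env:KITTI_ROOT`. The sketch below reproduces the path with the standard library and adds an early guard. It is one plausible reading of the traceback rather than a confirmed diagnosis: `calib_path` and `require_root` are hypothetical helpers, and `F:\data\kitti` is a made-up location.

```python
import ntpath  # Windows path rules, usable on any operating system


def calib_path(root, sequence):
    """Join the path pieces in the order shown in the traceback."""
    return ntpath.join(root, "dataset", "sequences", sequence, "calib.txt")


def require_root(root):
    """Fail early with a clear message if the dataset root is empty."""
    if not root:
        raise ValueError("kitti_root is empty; check that KITTI_ROOT is set")
    return root


# An empty root yields the relative path seen in the error message.
print(calib_path("", "00"))  # dataset\sequences\00\calib.txt

# A non-empty (hypothetical) root yields an absolute path instead.
print(calib_path(r"F:\data\kitti", "00"))
```

Calling a check like `require_root` before building any dataset path turns a confusing `FileNotFoundError` deep in the data loader into an immediate, explicit configuration error.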
  • about visualization

    (monoscene) ~/workplace/MonoScene$ python monoscene/scripts/visualization/kitti_vis_pred.py +file=/home/ruidong/workplace/MonoScene/outputs/kitti/08/000000.pkl +dataset=kitt
    monoscene/scripts/visualization/kitti_vis_pred.py:23: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
    Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
      coords_grid = coords_grid.astype(np.float)
    Traceback (most recent call last):
      File "monoscene/scripts/visualization/kitti_vis_pred.py", line 196, in main
        d=7,
      File "monoscene/scripts/visualization/kitti_vis_pred.py", line 75, in draw
        grid_coords = np.vstack([grid_coords.T, voxels.reshape(-1)]).T
    AttributeError: 'tuple' object has no attribute 'T'

    Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

    opened by DipDipPotatoChips 9
  • Porting the work of this paper to a new dataset

    Hello author, first of all thank you for your great work. I want to directly apply your work to the nuscenes dataset, is it possible? Does the nuscenes dataset need point cloud data to assist in generating voxel data?

    opened by yukaizhou 8
  • Can you help me in another paper?

    Hello! Last year, when you reproduced the SISC code (https://github.com/OPEN-AIR-SUN/SISC), you found a bug and solved it. I have now run into the same problem. Could you tell me how to solve it? Thank you very much!

    opened by WkangLiu 8
  • ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

    Something went wrong with my machine, so I reinstalled Ubuntu. I cloned the code again and kept only the data. But when I follow the README to install, it prints:

    (monoscene) ~/workplace/MonoScene$ pip install -e ./
    Obtaining file:///home/potato/workplace/MonoScene
    Installing collected packages: monoscene
      Running setup.py develop for monoscene
    Successfully installed monoscene-0.0.0
    (monoscene) ~/workplace/MonoScene$ python monoscene/scripts/train_monoscene.py dataset=kitti enable_log=true kitti_root=$KITTI_ROOT kitti_preprocess_root=$KITTI_PREPROCESS kitti_logdir=$KITTI_LOG n_gpus=1 batch_size=1 sem_scal_loss=False
    Traceback (most recent call last):
      File "monoscene/scripts/train_monoscene.py", line 1, in <module>
        from monoscene.data.semantic_kitti.kitti_dm import KittiDataModule
      File "/home/potato/workplace/MonoScene/monoscene/data/semantic_kitti/kitti_dm.py", line 3, in <module>
        import pytorch_lightning as pl
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
        from pytorch_lightning import metrics  # noqa: E402
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
        from pytorch_lightning.metrics.classification import (  # noqa: F401
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
        from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
        from pytorch_lightning.metrics.utils import deprecated_metrics, void
      File "/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
        from torchmetrics.utilities.data import get_num_classes as _get_num_classes
    ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/potato/anaconda3/envs/monoscene/lib/python3.7/site-packages/torchmetrics/utilities/data.py)

    opened by DipDipPotatoChips 7
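
The failing import in the traceback above sits inside `pytorch_lightning/metrics/utils.py`, which points to a mismatch between the installed `pytorch_lightning` and `torchmetrics` versions rather than to the MonoScene code itself. The sketch below checks whether a module still exposes a given name before training is launched. `has_symbol` is a hypothetical helper, and the likely remedy (installing the `torchmetrics` version that the project's installation instructions expect) is an assumption to verify against those instructions.

```python
import importlib


def has_symbol(module_name, attr):
    """Return True if `module_name` can be imported and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module, or one of its dependencies, is missing
    return hasattr(module, attr)


# The name that pytorch_lightning tries to import in the traceback.
ok = has_symbol("torchmetrics.utilities.data", "get_num_classes")
print("get_num_classes available:", ok)
```

If the check prints `False` in the environment used for training, the import chain shown in the traceback will fail in the same way.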
Releases(v0.1)
Owner
Codes from Computer Vision group of RITS Team, Inria