MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Related tags

Deep LearningMonoRec
Overview

MonoRec

Paper | Video (CVPR) | Video (Reconstruction) | Project Page

This repository is the official implementation of the paper:

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Felix Wimbauer*, Nan Yang*, Lukas Von Stumberg, Niclas Zeller and Daniel Cremers

CVPR 2021 (arXiv)

If you find our work useful, please consider citing our paper:

@InProceedings{wimbauer2020monorec,
  title = {{MonoRec}: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera},
  author = {Wimbauer, Felix and Yang, Nan and von Stumberg, Lukas and Zeller, Niclas and Cremers, Daniel},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021},
}

🏗️ ️ Setup

The conda environment for this project can be setup by running the following command:

conda env create -f environment.yml

🏃 Running the Example Script

We provide a sample from the KITTI Odometry test set and a script to run MonoRec on it in example/. To download the pretrained model and put it into the right place, run download_model.sh. You can manually do this by can by downloading the weights from here and unpacking the file to saved/checkpoints/monorec_depth_ref.pth. The example script will plot the keyframe, depth prediction and mask prediction.

cd example
python test_monorec.py

🗃️ Data

In all of our experiments we used the KITTI Odometry dataset for training. For additional evaluations, we used the KITTI, Oxford RobotCar, TUM Mono-VO and TUM RGB-D datasets. All datapaths can be specified in the respective configuration files. In our experiments, we put all datasets into a seperate folder ../data.

KITTI Odometry

To setup KITTI Odometry, download the color images and calibration files from the official website (around 145 GB). Instead of the given velodyne laser data files, we use the improved ground truth depth for evaluation, which can be downloaded from here.

Unzip the color images and calibration files into ../data. The lidar depth maps can be extracted into the given folder structure by running data_loader/scripts/preprocess_kitti_extract_annotated_depth.py.

For training and evaluation, we use the poses estimated by Deep Virtual Stereo Odometry (DVSO). They can be downloaded from here and should be placed under ../data/{kitti_path}/poses_dso. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.

The auxiliary moving object masks can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/mvobj_mask. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.

Oxford RobotCar

To setup Oxford RobotCar, download the camera model files and the large sample from the official website. Code, as well as, camera extrinsics need to be downloaded from the official GitHub repository. Please move the content of the python folder to data_loaders/oxford_robotcar/. extrinsics/, models/ and sample/ need to be moved to ../data/oxford_robotcar/. Note that for poses we use the official visual odometry poses, which are not provided in the large sample. They need to be downloaded manually from the raw dataset and unpacked into the sample folder.

TUM Mono-VO

Unfortunately, TUM Mono-VO images are provided only in the original, distorted form. Therefore, they need to be undistorted first before fed into MonoRec. To obtain poses for the sequences, we run the publicly available version of Direct Sparse Odometry.

TUM RGB-D

The official sequences can be downloaded from the official website and need to be unpacked under ../data/tumrgbd/{sequence_name}. Note that our provided dataset implementation assumes intrinsics from fr3 sequences. Note that the data loader for this dataset also relies on the code from the Oxford Robotcar dataset.

🏋️ Training & Evaluation

Please stay tuned! Training code will be published soon!

We provide checkpoints for each training stage:

Training stage Download
Depth Bootstrap Link
Mask Bootstrap Link
Mask Refinement Link
Depth Refinement (final model) Link

Run download_model.sh to download the final model. It will automatically get moved to saved/checkpoints.

To reproduce the evaluation results on different datasets, run the following commands:

python evaluate.py --config configs/evaluate/eval_monorec.json        # KITTI Odometry
python evaluate.py --config configs/evaluate/eval_monorec_oxrc.json   # Oxford Robotcar

☁️ Pointclouds

To reproduce the pointclouds depicted in the paper and video, use the following commands:

python create_pointcloud.py --config configs/test/pointcloud_monorec.json       # KITTI Odometry
python create_pointcloud.py --config configs/test/pointcloud_monorec_oxrc.json  # Oxford Robotcar
python create_pointcloud.py --config configs/test/pointcloud_monorec_tmvo.json  # TUM Mono-VO
Owner
Felix Wimbauer
M.Sc. Computer Science, Oxford, TUM, NUS
Felix Wimbauer
This repository is all about spending some time the with the original problem posed by Minsky and Papert

This repository is all about spending some time the with the original problem posed by Minsky and Papert. Working through this problem is a great way to begin learning computer vision.

Jaissruti Nanthakumar 1 Jan 23, 2022
Some tentative models that incorporate label propagation to graph neural networks for graph representation learning in nodes, links or graphs.

Some tentative models that incorporate label propagation to graph neural networks for graph representation learning in nodes, links or graphs.

zshicode 1 Nov 18, 2021
Optimizes image files by converting them to webp while also updating all references.

About Optimizes images by (re-)saving them as webp. For every file it replaced it automatically updates all references. Works on single files as well

Watermelon Wolverine 18 Dec 23, 2022
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

MEAL-V2 This is the official pytorch implementation of our paper: "MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tric

Zhiqiang Shen 653 Dec 19, 2022
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

CLIP-GEN [简体中文][English] 本项目在萤火二号集群上用 PyTorch 实现了论文 《CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP》。 CLIP-GEN 是一个 Language-F

75 Dec 29, 2022
This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search"

InvariantAncestrySearch This repository contains python code necessary to replicated the experiments performed in our paper "Invariant Ancestry Search

Phillip Bredahl Mogensen 0 Feb 02, 2022
Unofficial Tensorflow Implementation of ConvNeXt from A ConvNet for the 2020s

Tensorflow Implementation of "A ConvNet for the 2020s" This is the unofficial Tensorflow Implementation of ConvNeXt from "A ConvNet for the 2020s" pap

DK 11 Oct 12, 2022
Meta Learning for Semi-Supervised Few-Shot Classification

few-shot-ssl-public Code for paper Meta-Learning for Semi-Supervised Few-Shot Classification. [arxiv] Dependencies cv2 numpy pandas python 2.7 / 3.5+

Mengye Ren 501 Jan 08, 2023
An implementation of shampoo

shampoo.pytorch An implementation of shampoo, proposed in Shampoo : Preconditioned Stochastic Tensor Optimization by Vineet Gupta, Tomer Koren and Yor

Ryuichiro Hataya 69 Sep 10, 2022
Face and other object detection using OpenCV and ML Yolo

Object-and-Face-Detection-Using-Yolo- Opencv and YOLO object and face detection is implemented. You only look once (YOLO) is a state-of-the-art, real-

Happy N. Monday 3 Feb 15, 2022
Library for machine learning stacking generalization.

stacked_generalization Implemented machine learning *stacking technic[1]* as handy library in Python. Feature weighted linear stacking is also availab

114 Jul 19, 2022
An efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

An efficient 3D semantic segmentation framework for Urban-scale point clouds like SensatUrban, Campus3D, etc.

Zou 33 Jan 03, 2023
Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

A Unified Objective for Novel Class Discovery This is the official repository for the paper: A Unified Objective for Novel Class Discovery Enrico Fini

Enrico Fini 118 Dec 26, 2022
Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.

Hub is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size. The hub data layout enables rapid transformations and streaming of data while training m

Activeloop 5.1k Jan 08, 2023
Official implementation of the paper "Steganographer Detection via a Similarity Accumulation Graph Convolutional Network"

SAGCN - Official PyTorch Implementation | Paper | Project Page This is the official implementation of the paper "Steganographer detection via a simila

ZHANG Zhi 1 Nov 26, 2021
ICLR2021 (Under Review)

Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning This repository contains the official PyTorch implementation o

Haoyi Fan 58 Dec 30, 2022
Pytorch implementation of U-Net, R2U-Net, Attention U-Net, and Attention R2U-Net.

pytorch Implementation of U-Net, R2U-Net, Attention U-Net, Attention R2U-Net U-Net: Convolutional Networks for Biomedical Image Segmentation https://a

leejunhyun 2k Jan 02, 2023
Simple (but Strong) Baselines for POMDPs

Recurrent Model-Free RL is a Strong Baseline for Many POMDPs Welcome to the POMDP world! This repo provides some simple baselines for POMDPs, specific

Tianwei V. Ni 172 Dec 29, 2022
A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat

Hendrik Schröter 292 Dec 25, 2022
Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.

Physion: Evaluating Physical Prediction from Vision in Humans and Machines [paper] Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Y

Hsiao-Yu Fish Tung 18 Dec 19, 2022