Instance-wise Occlusion and Depth Orders in Natural Scenes (CVPR 2022)

Last update: Dec 27, 2022

Official source code for the paper, which appeared at CVPR 2022.

This repository provides a new dataset, named InstaOrder, that can be used to understand the geometric relationships of instances in an image. The dataset consists of 2.9M annotations of geometric orderings for class-labeled instances in 101K natural scenes. The scenes were annotated by 3,659 crowd-workers with (1) an occlusion order that identifies the occluder and occludee, and (2) a depth order that describes ordinal relations based on relative distance from the camera. This repository also introduces a geometric order prediction network called InstaOrderNet, which the paper reports to outperform state-of-the-art approaches.

Installation

This code was developed with Anaconda (Python 3.6), PyTorch 1.7.1, torchvision 0.8.2, and CUDA 10.1. Set up the environment as follows:

# build conda environment
conda create --name order python=3.6
conda activate order

# install requirements
pip install -r requirements.txt

# install COCO API
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
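
After installation, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names installed_version and compare_versions are hypothetical, and the check assumes each package exposes a __version__ attribute.

```python
import importlib

# Versions stated in the Installation section of this README.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


def compare_versions(expected=EXPECTED):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = installed_version(name)
        # Local build suffixes such as "+cu101" are ignored in the comparison.
        matches = found is not None and found.split("+")[0] == want
        report[name] = (want, found, matches)
    return report
```

Calling compare_versions() inside the activated conda environment returns one entry per package; a False in the last position indicates a missing package or a version that differs from the one this code was developed with.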

Visualization

Open InstaOrder_vis.ipynb to visualize the InstaOrder dataset, including instance masks, occlusion order, and depth order.

Training

The experiments folder contains the training and test scripts for the experiments reported in the paper.

To train {MODEL} with {DATASET}:

Download {DATASET} as described in the Datasets section.
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml
(Optional) To train InstaDepthNet, download MiDaS-v2.1 model-f6b98070.pt under ${base_dir}/data/out/InstaOrder_ckpt

Run the training script as follows:

sh experiments/{DATASET}/{MODEL}/train.sh

# Example of training InstaOrderNet^o (Table 3 in the main paper) from scratch
sh experiments/InstaOrder/InstaOrderNet_o/train.sh
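
When several dataset and model combinations need to be run, the script path pattern above can be built programmatically. The sketch below is an illustration only: script_path and run_experiment are hypothetical helpers that do not exist in the repository, and they assume the command is launched from the repository root.

```python
import subprocess
from pathlib import Path


def script_path(dataset, model, mode="train"):
    """Build experiments/{DATASET}/{MODEL}/{mode}.sh as a relative path."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    return Path("experiments") / dataset / model / (mode + ".sh")


def run_experiment(dataset, model, mode="train", repo_root="."):
    """Run the script with sh from repo_root, raising if it does not exist."""
    script = script_path(dataset, model, mode)
    if not (Path(repo_root) / script).is_file():
        raise FileNotFoundError("no such script: " + str(script))
    # Equivalent to: sh experiments/{DATASET}/{MODEL}/{mode}.sh
    return subprocess.run(["sh", str(script)], cwd=str(repo_root), check=True)
```

For example, run_experiment("InstaOrder", "InstaOrderNet_o") corresponds to the training command shown above.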

Inference

Download the pretrained models, InstaOrder_ckpt.zip (3.5G), and unzip the files into the structure below. Pretrained models are named {DATASET}_{MODEL}.pth.tar.

${base_dir}
|--data
|    |--out
|    |    |--InstaOrder_ckpt
|    |    |    |--COCOA_InstaOrderNet_o.pth.tar
|    |    |    |--COCOA_OrderNet.pth.tar
|    |    |    |--COCOA_pcnet_m.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_d.pth.tar
|    |    |    |--InstaOrder_InstaDepthNet_od.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_d.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_o.pth.tar
|    |    |    |--InstaOrder_InstaOrderNet_od.pth.tar
|    |    |    |--InstaOrder_OrderNet.pth.tar
|    |    |    |--InstaOrder_OrderNet_ext.pth.tar  
|    |    |    |--InstaOrder_pcnet_m.pth.tar
|    |    |    |--KINS_InstaOrderNet_o.pth.tar
|    |    |    |--KINS_OrderNet.pth.tar
|    |    |    |--KINS_pcnet_m.pth.tar

(Optional) To test InstaDepthNet, download MiDaS-v2.1 model-f6b98070.pt under ${base_dir}/data/out/InstaOrder_ckpt
Set ${base_dir} correctly in experiments/{DATASET}/{MODEL}/config.yaml
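
Before running a test script, the checkpoint folder can be checked against the naming convention {DATASET}_{MODEL}.pth.tar. This is a minimal sketch under stated assumptions: checkpoint_path and missing_checkpoints are hypothetical helpers, and the PRETRAINED list simply mirrors the directory listing above.

```python
from pathlib import Path

# (dataset, model) pairs copied from the checkpoint listing in this README.
PRETRAINED = [
    ("COCOA", "InstaOrderNet_o"), ("COCOA", "OrderNet"), ("COCOA", "pcnet_m"),
    ("InstaOrder", "InstaDepthNet_d"), ("InstaOrder", "InstaDepthNet_od"),
    ("InstaOrder", "InstaOrderNet_d"), ("InstaOrder", "InstaOrderNet_o"),
    ("InstaOrder", "InstaOrderNet_od"), ("InstaOrder", "OrderNet"),
    ("InstaOrder", "OrderNet_ext"), ("InstaOrder", "pcnet_m"),
    ("KINS", "InstaOrderNet_o"), ("KINS", "OrderNet"), ("KINS", "pcnet_m"),
]


def checkpoint_path(base_dir, dataset, model):
    """Return ${base_dir}/data/out/InstaOrder_ckpt/{DATASET}_{MODEL}.pth.tar."""
    name = "{}_{}.pth.tar".format(dataset, model)
    return Path(base_dir) / "data" / "out" / "InstaOrder_ckpt" / name


def missing_checkpoints(base_dir, pairs=PRETRAINED):
    """List the checkpoint file names that are not present under base_dir."""
    paths = [checkpoint_path(base_dir, d, m) for d, m in pairs]
    return [p.name for p in paths if not p.is_file()]
```

An empty list from missing_checkpoints(base_dir) suggests the archive was unzipped into the expected location.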

To test {MODEL} with {DATASET}, run the test script as follows:

sh experiments/{DATASET}/{MODEL}/test.sh

# Example of reproducing the accuracy of InstaOrderNet^o (Table 3 in the main paper)
sh experiments/InstaOrder/InstaOrderNet_o/test.sh

Datasets

InstaOrder dataset

To use InstaOrder, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--COCO
|    |    |--train2017/
|    |    |--val2017/
|    |    |--annotations/
|    |    |    |--instances_train2017.json
|    |    |    |--instances_val2017.json
|    |    |    |--InstaOrder_train2017.json
|    |    |    |--InstaOrder_val2017.json

COCOA dataset

To use COCOA, download the files and arrange them in the following structure:

${base_dir}
|--data
|    |--COCO
|    |    |--train2014/
|    |    |--val2014/
|    |    |--annotations/
|    |    |    |--COCO_amodal_train2014.json 
|    |    |    |--COCO_amodal_val2014.json

KINS dataset

To use KINS, download the KINS dataset and arrange the files in the following structure:

${base_dir}
|--data
|    |--KINS
|    |    |--training/
|    |    |--testing/
|    |    |--instances_val.json
|    |    |--instances_train.json

DIW dataset

To use DIW, download the DIW dataset and arrange the files in the following structure:

${base_dir}
|--data
|    |--DIW
|    |    |--DIW_test/
|    |    |--DIW_Annotations
|    |    |    |--DIW_test.csv
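
The four directory layouts above can be verified with a short script before training or testing. The following is a sketch rather than repository code: EXPECTED_LAYOUT and missing_entries are hypothetical names, and the entries are transcribed from the trees in this section, relative to ${base_dir}/data.

```python
from pathlib import Path

# Paths relative to ${base_dir}/data, transcribed from the trees in this README.
EXPECTED_LAYOUT = {
    "InstaOrder": [
        "COCO/train2017",
        "COCO/val2017",
        "COCO/annotations/instances_train2017.json",
        "COCO/annotations/instances_val2017.json",
        "COCO/annotations/InstaOrder_train2017.json",
        "COCO/annotations/InstaOrder_val2017.json",
    ],
    "COCOA": [
        "COCO/train2014",
        "COCO/val2014",
        "COCO/annotations/COCO_amodal_train2014.json",
        "COCO/annotations/COCO_amodal_val2014.json",
    ],
    "KINS": [
        "KINS/training",
        "KINS/testing",
        "KINS/instances_val.json",
        "KINS/instances_train.json",
    ],
    "DIW": [
        "DIW/DIW_test",
        "DIW/DIW_Annotations/DIW_test.csv",
    ],
}


def missing_entries(base_dir, dataset):
    """Return the expected files or folders of a dataset that do not exist."""
    data_dir = Path(base_dir) / "data"
    return [rel for rel in EXPECTED_LAYOUT[dataset]
            if not (data_dir / rel).exists()]
```

For instance, missing_entries(base_dir, "KINS") returns an empty list once the KINS files are in place, and otherwise names the entries that still need to be downloaded or moved.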

Citing InstaOrder

If you find this code/data useful in your research then please cite our paper:

@inproceedings{lee2022instaorder,
  title={{Instance-wise Occlusion and Depth Orders in Natural Scenes}},
  author={Hyunmin Lee and Jaesik Park},
  booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
  year={2022}
}

Acknowledgement

We referred to and borrowed from the implementations by Xiaohang Zhan.
