[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

Overview

OpenCOOD

Documentation Status License: MIT

OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV2V.

News

03/17/2022: V2VNet is supported and the results/trained model are provided in the benchmark table.

03/10/2022: Results and pretrained weights for Attentive Fusion with compression are provided.

02/20/2022: F-Cooper now is supported and the results/traiend model can be found in the benchmark table.

01/31/2022: Our paper OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication has been accpted by ICRA2022!

09/21/2021: OPV2V dataset is public available: https://mobility-lab.seas.ucla.edu/opv2v/

Features

  • Provide easy data API for the Vehicle-to-Vehicle (V2V) multi-modal perception dataset OPV2V

    It currently provides easy API to load LiDAR data from multiple agents simultaneously in a structured format and convert to PyTorch Tesnor directly for model use.

  • Provide multiple SOTA 3D detection backbone

    It supports state-of-the-art LiDAR detector including PointPillar, Pixor, VoxelNet, and SECOND.

  • Support most common fusion strategies

    It includes 3 most common fusion strategies: early fusion, late fusion, and intermediate fusion across different agents.

  • Support several SOTA multi-agent visual fusion model

    It supports the most recent multi-agent perception algorithms (currently up to Sep. 2021) including Attentive Fusion, Cooper (early fusion), F-Cooper, V2VNet etc. We will keep updating the newest algorithms.

  • Provide a convenient log replay toolbox for OPV2V dataset (coming soon)

    It also provides an easy tool to replay the original OPV2V dataset. More importantly, it allows users to enrich the original dataset by attaching new sensors or define additional tasks (e.g. tracking, prediction) without changing the events in the initial dataset (e.g. positions and number of all vehicles, traffic speed).

Data Downloading

All the data can be downloaded from google drive. If you have a good internet, you can directly download the complete large zip file such as train.zip. In case you suffer from downloading large fiels, we also split each data set into small chunks, which can be found in the directory ending with _chunks, such as train_chunks. After downloading, please run the following command to each set to merge those chunks together:

cat train.zip.parta* > train.zip
unzip train.zip

Installation

Please refer to data introduction and installation guide to prepare data and install OpenCOOD. To see more details of OPV2V data, please check our website.

Quick Start

Data sequence visualization

To quickly visualize the LiDAR stream in the OPV2V dataset, first modify the validate_dir in your opencood/hypes_yaml/visualization.yaml to the opv2v data path on your local machine, e.g. opv2v/validate, and the run the following commond:

cd ~/OpenCOOD
python opencood/visualization/vis_data_sequence.py [--color_mode ${COLOR_RENDERING_MODE}]

Arguments Explanation:

  • color_mode : str type, indicating the lidar color rendering mode. You can choose from 'constant', 'intensity' or 'z-value'.

Train your model

OpenCOOD uses yaml file to configure all the parameters for training. To train your own model from scratch or a continued checkpoint, run the following commonds:

python opencood/tools/train.py --hypes_yaml ${CONFIG_FILE} [--model_dir  ${CHECKPOINT_FOLDER}]

Arguments Explanation:

  • hypes_yaml: the path of the training configuration file, e.g. opencood/hypes_yaml/second_early_fusion.yaml, meaning you want to train an early fusion model which utilizes SECOND as the backbone. See Tutorial 1: Config System to learn more about the rules of the yaml files.
  • model_dir (optional) : the path of the checkpoints. This is used to fine-tune the trained models. When the model_dir is given, the trainer will discard the hypes_yaml and load the config.yaml in the checkpoint folder.

Test the model

Before you run the following command, first make sure the validation_dir in config.yaml under your checkpoint folder refers to the testing dataset path, e.g. opv2v_data_dumping/test.

python opencood/tools/inference.py --model_dir ${CHECKPOINT_FOLDER} --fusion_method ${FUSION_STRATEGY} [--show_vis] [--show_sequence]

Arguments Explanation:

  • model_dir: the path to your saved model.
  • fusion_method: indicate the fusion strategy, currently support 'early', 'late', and 'intermediate'.
  • show_vis: whether to visualize the detection overlay with point cloud.
  • show_sequence : the detection results will visualized in a video stream. It can NOT be set with show_vis at the same time.

The evaluation results will be dumped in the model directory.

Benchmark and model zoo

Results on OPV2V dataset ([email protected] for no-compression/ compression)

Backbone Fusion Strategy Bandwidth (Megabit),
before/after compression
Default Towns Culver City Download
Naive Late PointPillar Late 0.024/0.024 0.781/0.781 0.668/0.668 url
Cooper PointPillar Early 7.68/7.68 0.800/x 0.696/x url
Attentive Fusion PointPillar Intermediate 126.8/1.98 0.815/0.810 0.735/0.731 url
F-Cooper PointPillar Intermediate 72.08/1.12 0.790/0.788 0.728/0.726 url
V2VNet PointPillar Intermediate 72.08/1.12 0.822/0.814 0.734/0.729 url
Naive Late VoxelNet Late 0.024/0.024 0.738/0.738 0.588/0.588 url
Cooper VoxelNet Early 7.68/7.68 0.758/x 0.677/x url
Attentive Fusion VoxelNet Intermediate 576.71/1.12 0.864/0.852 0.775/0.746 url
Naive Late SECOND Late 0.024/0.024 0.775/0.775 0.682/0.682 url
Cooper SECOND Early 7.68/7.68 0.813/x 0.738/x url
Attentive SECOND Intermediate 63.4/0.99 0.826/0.783 0.760/0.760 url
Naive Late PIXOR Late 0.024/0.024 0.578/0.578 0.360/0.360 url
Cooper PIXOR Early 7.68/7.68 0.678/x 0.558/x url
Attentive PIXOR Intermediate 313.75/1.22 0.687/0.612 0.546/0.492 url

Note:

  • We suggest using PointPillar as the backbone when you are creating your method and try to compare with our benchmark, as we implement most of the SOTA methods with this backbone only.
  • We assume the transimssion rate is 27Mbp/s. Considering the frequency of LiDAR is 10Hz, the bandwidth requirement should be less than 2.7Mbp to avoid severe delay.
  • A 'x' in the benchmark table represents the bandwidth requirement is too large, which can not be considered to employ in practice.

Tutorials

We have a series of tutorials to help you understand OpenCOOD more. Please check the series of our tutorials.

Citation

If you are using our OpenCOOD framework or OPV2V dataset for your research, please cite the following paper:

@inproceedings{xu2022opencood,
 author = {Runsheng Xu, Hao Xiang, Xin Xia, Xu Han, Jinlong Li, Jiaqi Ma},
 title = {OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication},
 booktitle = {2022 IEEE International Conference on Robotics and Automation (ICRA)},
 year = {2022}}

Also, under this LICENSE, OpenCOOD is for non-commercial research only. Researchers can modify the source code for their own research only. Contracted work that generates corporate revenues and other general commercial use are prohibited under this LICENSE. See the LICENSE file for details and possible opportunities for commercial use.

Future Plans

  • Provide camera APIs for OPV2V
  • Provide the log replay toolbox
  • Implement F-Cooper
  • Implement V2VNet
  • Implement DiscoNet

Contributors

OpenCOOD is supported by the UCLA Mobility Lab. We also appreciate the great work from OpenPCDet, as part of our works use their framework.

Lab Principal Investigator:

Project Lead:

Owner
Runsheng Xu
UCLA PHD candidate, Former Senior Machine Learning Engineer in Mercedes Benz R&D North America
Runsheng Xu
TrackFormer: Multi-Object Tracking with Transformers

TrackFormer: Multi-Object Tracking with Transformers This repository provides the official implementation of the TrackFormer: Multi-Object Tracking wi

Tim Meinhardt 321 Dec 29, 2022
Stochastic Scene-Aware Motion Prediction

Stochastic Scene-Aware Motion Prediction [Project Page] [Paper] Description This repository contains the training code for MotionNet and GoalNet of SA

Mohamed Hassan 31 Dec 09, 2022
DrWhy is the collection of tools for eXplainable AI (XAI). It's based on shared principles and simple grammar for exploration, explanation and visualisation of predictive models.

Responsible Machine Learning With Great Power Comes Great Responsibility. Voltaire (well, maybe) How to develop machine learning models in a responsib

Model Oriented 590 Dec 26, 2022
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1  Liang Pan1  Zhongang Cai1,2,3  Ziwei Liu1* 1S-Lab, Nanyang Technologic

Fangzhou Hong 96 Jan 03, 2023
Code repository for the paper "Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation" with instructions to reproduce the results.

Doubly Trained Neural Machine Translation System for Adversarial Attack and Data Augmentation Languages Experimented: Data Overview: Source Target Tra

Steven Tan 1 Aug 18, 2022
LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021

LoFTR-with-train-script LoFTR:Detector-Free Local Feature Matching with Transformers CVPR 2021 (with train script --- unofficial ---). About Megadepth

Nan Xiaohu 15 Nov 04, 2022
A PyTorch-based Semi-Supervised Learning (SSL) Codebase for Pixel-wise (Pixel) Vision Tasks

PixelSSL is a PyTorch-based semi-supervised learning (SSL) codebase for pixel-wise (Pixel) vision tasks. The purpose of this project is to promote the

Zhanghan Ke 255 Dec 11, 2022
Official repository of the paper "A Variational Approximation for Analyzing the Dynamics of Panel Data". Mixed Effect Neural ODE. UAI 2021.

Official repository of the paper (UAI 2021) "A Variational Approximation for Analyzing the Dynamics of Panel Data", Mixed Effect Neural ODE. Panel dat

Jurijs Nazarovs 7 Nov 26, 2022
Turn based roguelike in python

pyTB Turn based roguelike in python Documentation can be found here: http://mcgillij.github.io/pyTB/index.html Screenshot Dependencies Written in Pyth

Jason McGillivray 4 Sep 29, 2022
Semantic similarity computation with different state-of-the-art metrics

Semantic similarity computation with different state-of-the-art metrics Description • Installation • Usage • License Description TaxoSS is a semantic

6 Jun 22, 2022
Pytorch implementation of few-shot semantic image synthesis

Few-shot Semantic Image Synthesis Using StyleGAN Prior Our method can synthesize photorealistic images from dense or sparse semantic annotations using

40 Sep 26, 2022
pip install python-office

🍬 python for office 👉 http://www.python4office.cn/ 👈 🌎 English Documentation 📚 简介 Python-office 是一个 Python 自动化办公第三方库,能解决大部分自动化办公的问题。而且每个功能只需一行代码,

程序员晚枫 272 Dec 29, 2022
Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neural Network

ild-cnn This is supplementary material for the manuscript: "Lung Pattern Classification for Interstitial Lung Diseases Using a Deep Convolutional Neur

22 Nov 05, 2022
[ICML 2021] “ Self-Damaging Contrastive Learning”, Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Self-Damaging Contrastive Learning Introduction The recent breakthrough achieved by contrastive learning accelerates the pace for deploying unsupervis

VITA 51 Dec 29, 2022
MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

MNE-Python MNE-Python software is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, E

MNE tools for MEG and EEG data analysis 2.1k Dec 28, 2022
Learning Super-Features for Image Retrieval

Learning Super-Features for Image Retrieval This repository contains the code for running our FIRe model presented in our ICLR'22 paper: @inproceeding

NAVER 101 Dec 28, 2022
Gym for multi-agent reinforcement learning

PettingZoo is a Python library for conducting research in multi-agent reinforcement learning, akin to a multi-agent version of Gym. Our website, with

Farama Foundation 1.6k Jan 09, 2023
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation This is a demo implementation of BYOL for Audio (BYOL-A), a self-sup

NTT Communication Science Laboratories 160 Jan 04, 2023
A framework that constructs deep neural networks, autoencoders, logistic regressors, and linear networks

A framework that constructs deep neural networks, autoencoders, logistic regressors, and linear networks without the use of any outside machine learning libraries - all from scratch.

Kordel K. France 2 Nov 14, 2022