[ICRA 2022] An opensource framework for cooperative detection. Official implementation for OPV2V.

Overview

OpenCOOD

Documentation Status License: MIT

OpenCOOD is an Open COOperative Detection framework for autonomous driving. It is also the official implementation of the ICRA 2022 paper OPV2V.

News

03/17/2022: V2VNet is supported and the results/trained model are provided in the benchmark table.

03/10/2022: Results and pretrained weights for Attentive Fusion with compression are provided.

02/20/2022: F-Cooper now is supported and the results/traiend model can be found in the benchmark table.

01/31/2022: Our paper OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication has been accpted by ICRA2022!

09/21/2021: OPV2V dataset is public available: https://mobility-lab.seas.ucla.edu/opv2v/

Features

  • Provide easy data API for the Vehicle-to-Vehicle (V2V) multi-modal perception dataset OPV2V

    It currently provides easy API to load LiDAR data from multiple agents simultaneously in a structured format and convert to PyTorch Tesnor directly for model use.

  • Provide multiple SOTA 3D detection backbone

    It supports state-of-the-art LiDAR detector including PointPillar, Pixor, VoxelNet, and SECOND.

  • Support most common fusion strategies

    It includes 3 most common fusion strategies: early fusion, late fusion, and intermediate fusion across different agents.

  • Support several SOTA multi-agent visual fusion model

    It supports the most recent multi-agent perception algorithms (currently up to Sep. 2021) including Attentive Fusion, Cooper (early fusion), F-Cooper, V2VNet etc. We will keep updating the newest algorithms.

  • Provide a convenient log replay toolbox for OPV2V dataset (coming soon)

    It also provides an easy tool to replay the original OPV2V dataset. More importantly, it allows users to enrich the original dataset by attaching new sensors or define additional tasks (e.g. tracking, prediction) without changing the events in the initial dataset (e.g. positions and number of all vehicles, traffic speed).

Data Downloading

All the data can be downloaded from google drive. If you have a good internet, you can directly download the complete large zip file such as train.zip. In case you suffer from downloading large fiels, we also split each data set into small chunks, which can be found in the directory ending with _chunks, such as train_chunks. After downloading, please run the following command to each set to merge those chunks together:

cat train.zip.parta* > train.zip
unzip train.zip

Installation

Please refer to data introduction and installation guide to prepare data and install OpenCOOD. To see more details of OPV2V data, please check our website.

Quick Start

Data sequence visualization

To quickly visualize the LiDAR stream in the OPV2V dataset, first modify the validate_dir in your opencood/hypes_yaml/visualization.yaml to the opv2v data path on your local machine, e.g. opv2v/validate, and the run the following commond:

cd ~/OpenCOOD
python opencood/visualization/vis_data_sequence.py [--color_mode ${COLOR_RENDERING_MODE}]

Arguments Explanation:

  • color_mode : str type, indicating the lidar color rendering mode. You can choose from 'constant', 'intensity' or 'z-value'.

Train your model

OpenCOOD uses yaml file to configure all the parameters for training. To train your own model from scratch or a continued checkpoint, run the following commonds:

python opencood/tools/train.py --hypes_yaml ${CONFIG_FILE} [--model_dir  ${CHECKPOINT_FOLDER}]

Arguments Explanation:

  • hypes_yaml: the path of the training configuration file, e.g. opencood/hypes_yaml/second_early_fusion.yaml, meaning you want to train an early fusion model which utilizes SECOND as the backbone. See Tutorial 1: Config System to learn more about the rules of the yaml files.
  • model_dir (optional) : the path of the checkpoints. This is used to fine-tune the trained models. When the model_dir is given, the trainer will discard the hypes_yaml and load the config.yaml in the checkpoint folder.

Test the model

Before you run the following command, first make sure the validation_dir in config.yaml under your checkpoint folder refers to the testing dataset path, e.g. opv2v_data_dumping/test.

python opencood/tools/inference.py --model_dir ${CHECKPOINT_FOLDER} --fusion_method ${FUSION_STRATEGY} [--show_vis] [--show_sequence]

Arguments Explanation:

  • model_dir: the path to your saved model.
  • fusion_method: indicate the fusion strategy, currently support 'early', 'late', and 'intermediate'.
  • show_vis: whether to visualize the detection overlay with point cloud.
  • show_sequence : the detection results will visualized in a video stream. It can NOT be set with show_vis at the same time.

The evaluation results will be dumped in the model directory.

Benchmark and model zoo

Results on OPV2V dataset ([email protected] for no-compression/ compression)

Backbone Fusion Strategy Bandwidth (Megabit),
before/after compression
Default Towns Culver City Download
Naive Late PointPillar Late 0.024/0.024 0.781/0.781 0.668/0.668 url
Cooper PointPillar Early 7.68/7.68 0.800/x 0.696/x url
Attentive Fusion PointPillar Intermediate 126.8/1.98 0.815/0.810 0.735/0.731 url
F-Cooper PointPillar Intermediate 72.08/1.12 0.790/0.788 0.728/0.726 url
V2VNet PointPillar Intermediate 72.08/1.12 0.822/0.814 0.734/0.729 url
Naive Late VoxelNet Late 0.024/0.024 0.738/0.738 0.588/0.588 url
Cooper VoxelNet Early 7.68/7.68 0.758/x 0.677/x url
Attentive Fusion VoxelNet Intermediate 576.71/1.12 0.864/0.852 0.775/0.746 url
Naive Late SECOND Late 0.024/0.024 0.775/0.775 0.682/0.682 url
Cooper SECOND Early 7.68/7.68 0.813/x 0.738/x url
Attentive SECOND Intermediate 63.4/0.99 0.826/0.783 0.760/0.760 url
Naive Late PIXOR Late 0.024/0.024 0.578/0.578 0.360/0.360 url
Cooper PIXOR Early 7.68/7.68 0.678/x 0.558/x url
Attentive PIXOR Intermediate 313.75/1.22 0.687/0.612 0.546/0.492 url

Note:

  • We suggest using PointPillar as the backbone when you are creating your method and try to compare with our benchmark, as we implement most of the SOTA methods with this backbone only.
  • We assume the transimssion rate is 27Mbp/s. Considering the frequency of LiDAR is 10Hz, the bandwidth requirement should be less than 2.7Mbp to avoid severe delay.
  • A 'x' in the benchmark table represents the bandwidth requirement is too large, which can not be considered to employ in practice.

Tutorials

We have a series of tutorials to help you understand OpenCOOD more. Please check the series of our tutorials.

Citation

If you are using our OpenCOOD framework or OPV2V dataset for your research, please cite the following paper:

@inproceedings{xu2022opencood,
 author = {Runsheng Xu, Hao Xiang, Xin Xia, Xu Han, Jinlong Li, Jiaqi Ma},
 title = {OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication},
 booktitle = {2022 IEEE International Conference on Robotics and Automation (ICRA)},
 year = {2022}}

Also, under this LICENSE, OpenCOOD is for non-commercial research only. Researchers can modify the source code for their own research only. Contracted work that generates corporate revenues and other general commercial use are prohibited under this LICENSE. See the LICENSE file for details and possible opportunities for commercial use.

Future Plans

  • Provide camera APIs for OPV2V
  • Provide the log replay toolbox
  • Implement F-Cooper
  • Implement V2VNet
  • Implement DiscoNet

Contributors

OpenCOOD is supported by the UCLA Mobility Lab. We also appreciate the great work from OpenPCDet, as part of our works use their framework.

Lab Principal Investigator:

Project Lead:

Owner
Runsheng Xu
UCLA PHD candidate, Former Senior Machine Learning Engineer in Mercedes Benz R&D North America
Runsheng Xu
Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

PLOME:Pre-training with Misspelled Knowledge for Chinese Spelling Correction (ACL2021) This repository provides the code and data of the work in ACL20

197 Nov 26, 2022
General purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends)

General purpose GPU compute framework for cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing usec

The Kompute Project 1k Jan 06, 2023
2021 Artificial Intelligence Diabetes Datathon

A.I.D.D. 2021 2021 Artificial Intelligence Diabetes Datathon A.I.D.D. 2021은 ‘2021 인공지능 학습용 데이터 구축사업’을 통해 만들어진 학습용 데이터를 활용하여 당뇨병을 효과적으로 예측할 수 있는가에 대한 A

2 Dec 27, 2021
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
Add gui for YoloV5 using PyQt5

HEAD 更新2021.08.16 **添加图片和视频保存功能: 1.图片和视频按照当前系统时间进行命名 2.各自检测结果存放入output文件夹 3.摄像头检测的默认设备序号更改为0,减少调试报错 温馨提示: 1.项目放置在全英文路径下,防止项目报错 2.默认使用cpu进行检测,自

Ruihao Wang 65 Dec 27, 2022
The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding, by Chuhan Zhang, Ankush Gupta and Andrew Zisserman.

Temporal Query Networks for Fine-grained Video Understanding 📋 This repository contains the implementation of CVPR2021 paper Temporal_Query_Networks

55 Dec 21, 2022
Relative Uncertainty Learning for Facial Expression Recognition

Relative Uncertainty Learning for Facial Expression Recognition The official implementation of the following paper at NeurIPS2021: Title: Relative Unc

35 Dec 28, 2022
Intel® Neural Compressor is an open-source Python library running on Intel CPUs and GPUs

Intel® Neural Compressor targeting to provide unified APIs for network compression technologies, such as low precision quantization, sparsity, pruning, knowledge distillation, across different deep l

Intel Corporation 846 Jan 04, 2023
TensorFlow implementation of Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction)

Barlow-Twins-TF This repository implements Barlow Twins (Barlow Twins: Self-Supervised Learning via Redundancy Reduction) in TensorFlow and demonstrat

Sayak Paul 36 Sep 14, 2022
CAPRI: Context-Aware Interpretable Point-of-Interest Recommendation Framework

CAPRI: Context-Aware Interpretable Point-of-Interest Recommendation Framework This repository contains a framework for Recommender Systems (RecSys), a

RecSys Lab 8 Jul 03, 2022
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

English | 简体中文 Easy Parallel Library Overview Easy Parallel Library (EPL) is a general and efficient library for distributed model training. Usability

Alibaba 185 Dec 21, 2022
OpenMMLab Image and Video Editing Toolbox

Introduction MMEditing is an open source image and video editing toolbox based on PyTorch. It is a part of the OpenMMLab project. The master branch wo

OpenMMLab 3.9k Jan 04, 2023
A curated list of the latest breakthroughs in AI (in 2021) by release date with a clear video explanation, link to a more in-depth article, and code.

2021: A Year Full of Amazing AI papers- A Review 📌 A curated list of the latest breakthroughs in AI by release date with a clear video explanation, l

Louis-François Bouchard 2.9k Dec 31, 2022
Baseline powergrid model for NY

Baseline-powergrid-model-for-NY Table of Contents About The Project Built With Usage License Contact Acknowledgements About The Project As the urgency

Anderson Energy Lab at Cornell 6 Nov 24, 2022
Alleviating Over-segmentation Errors by Detecting Action Boundaries

Alleviating Over-segmentation Errors by Detecting Action Boundaries Forked from ASRF offical code. This repo is the a implementation of replacing orig

13 Dec 12, 2022
Canonical Appearance Transformations

CAT-Net: Learning Canonical Appearance Transformations Code to accompany our paper "How to Train a CAT: Learning Canonical Appearance Transformations

STARS Laboratory 54 Dec 24, 2022
This is the official code for the paper "Ad2Attack: Adaptive Adversarial Attack for Real-Time UAV Tracking".

Ad^2Attack:Adaptive Adversarial Attack on Real-Time UAV Tracking Demo video 📹 Our video on bilibili demonstrates the test results of Ad^2Attack on se

Intelligent Vision for Robotics in Complex Environment 10 Nov 07, 2022
Husein pet projects in here!

project-suka-suka Husein pet projects in here! List of projects mysejahtera-density. Generate resolution points using meshgrid and request each points

HUSEIN ZOLKEPLI 47 Dec 09, 2022
TalkingHead-1KH is a talking-head dataset consisting of YouTube videos

TalkingHead-1KH Dataset TalkingHead-1KH is a talking-head dataset consisting of YouTube videos, originally created as a benchmark for face-vid2vid: On

173 Dec 29, 2022
Namish Khanna 40 Oct 11, 2022