Moving Object Segmentation in 3D LiDAR Data: A Learning-based Approach Exploiting Sequential Data

Last update: Dec 29, 2022

Overview

LiDAR-MOS: Moving Object Segmentation in 3D LiDAR Data

This repo contains the code for our paper: Moving Object Segmentation in 3D LiDAR Data: A Learning-based Approach Exploiting Sequential Data PDF.

Our approach accurately segments the scene into moving and static objects, e.g., distinguishing moving cars from parked cars. It runs faster than the frame rate of the sensor and can be used to improve 3D LiDAR-based odometry/SLAM and mapping results, as shown below.

Additionally, we created a new benchmark for LiDAR-based moving object segmentation based on SemanticKITTI here.

The complete demo video can be found on YouTube here.

Introduction of the repo and benchmark
Publication
Dependencies
How to use
Applications
Collection of downloads
License

Publication

If you use our code and benchmark in your academic work, please cite the corresponding paper:

@article{chen2021ral,
    title={{Moving Object Segmentation in 3D LiDAR Data: A Learning-based Approach Exploiting Sequential Data}},
    author={X. Chen and S. Li and B. Mersch and L. Wiesmann and J. Gall and J. Behley and C. Stachniss},
    year={2021},
    journal={IEEE Robotics and Automation Letters (RA-L)},
    doi = {10.1109/LRA.2021.3093567},
    issn = {2377-3766},
}

Dependencies

We built and tested our work based on SalsaNext, RangeNet++ and MINet. We thank the original authors for their nice work and implementation. If you are interested in fast LiDAR-based semantic segmentation, we strongly recommend having a look at the original repositories.

Note that in this repo we show how easily LiDAR-based moving object segmentation can be achieved by exploiting sequential information with existing segmentation networks. We did not change the original pipeline of the segmentation networks; we only changed the data loader and the input of the network, as shown in the figure below. Therefore, our method can be used with any range-image-based LiDAR segmentation network.

Our method is based on range images. To use range projection with a fast C++ library, please find the usage doc here.
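For readers unfamiliar with range images, the following is a minimal NumPy sketch of a spherical range projection. It is not the repository's implementation: the function name `range_projection`, the image size (64 x 2048), and the vertical field of view (+3 to -25 degrees, typical for a 64-beam sensor) are assumptions chosen for illustration.

```python
import numpy as np


def range_projection(points, height=64, width=2048,
                     fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width) range image.

    Pixels that receive no point keep the value -1. All default
    parameters are illustrative assumptions, not values from the repo.
    """
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view

    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0  # zero-range points would produce NaN angles
    points, depth = points[valid], depth[valid]

    yaw = -np.arctan2(points[:, 1], points[:, 0])
    pitch = np.arcsin(points[:, 2] / depth)

    # Normalise the angles to [0, 1] and scale to pixel coordinates.
    u = 0.5 * (yaw / np.pi + 1.0) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    u = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    # Write far points first so the closest point wins for each pixel.
    order = np.argsort(depth)[::-1]
    image = np.full((height, width), -1.0, dtype=np.float32)
    image[v[order], u[order]] = depth[order]
    return image
```

Filtering out zero-range points before computing the angles avoids NaN values that would otherwise turn into invalid pixel indices.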

How to use

For a quick test of all the steps below, one can download a toy dataset here and decompress it in the data/ folder, following the data structure described in data/README.md.

Prepare training data

To use our method, one needs to generate the residual images. Here is a quick demo:

  $ python3 utils/gen_residual_images.py

More settings for the data preparation can be found in the yaml file config/data_preparing.yaml. To prepare the training data for the whole KITTI-Odometry dataset, please download the dataset from the original website.
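As a rough illustration of what a residual image contains, here is a minimal sketch. It assumes the definition described in the paper (the per-pixel absolute range difference, normalised by the current range) and that the past scan has already been transformed into the current frame and re-projected to a range image. The function name `residual_image` and the convention that invalid pixels are non-positive are assumptions for this example; utils/gen_residual_images.py remains the reference.

```python
import numpy as np


def residual_image(range_current, range_past_in_current):
    """Normalised absolute range difference per pixel.

    Both inputs are (H, W) range images in the current frame.
    Pixels that are invalid (<= 0) in either image get residual 0.
    """
    range_current = np.asarray(range_current, dtype=np.float32)
    range_past = np.asarray(range_past_in_current, dtype=np.float32)

    valid = (range_current > 0) & (range_past > 0)
    residual = np.zeros_like(range_current, dtype=np.float32)
    residual[valid] = (
        np.abs(range_current[valid] - range_past[valid]) / range_current[valid]
    )
    return residual
```

Static surfaces yield residuals close to zero, while pixels covered by a moving object tend to show larger values.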

Using SalsaNext as the baseline

To use SalsaNext as the baseline segmentation network for LiDAR-MOS, one should follow the mos_SalsaNext/README.md to set it up.

Inferring

To generate the LiDAR-MOS predictions with a pretrained model in a quick test on the toy dataset, directly run:

  $ cd mos_SalsaNext/train/tasks/semantic
  $ python3 infer.py -d ../../../../data -m ../../../../data/model_salsanext_residual_1 -l ../../../../data/predictions_salsanext_residual_1_new -s valid

To run inference on the whole dataset, please download the KITTI-Odometry dataset from the original website and change the corresponding paths:

  $ cd mos_SalsaNext/train/tasks/semantic
  $ python3 infer.py -d path/to/kitti/dataset -m path/to/pretrained_model -l path/to/log -s train/valid/test # depending on the desired split to evaluate

Training

To train a LiDAR-MOS network with SalsaNext from scratch, one has to download the KITTI-Odometry dataset and the SemanticKITTI dataset, change the corresponding paths, and run:

  $ cd mos_SalsaNext/train/tasks/semantic
  $ ./train.sh -d path/to/kitti/dataset -a salsanext_mos.yml -l path/to/log -c 0  # the number of used gpu cores

Using RangeNet++ as the baseline

To use RangeNet++ as the baseline segmentation network for LiDAR-MOS, one should follow the mos_RangeNet/README.md to set it up.

Inferring

To run inference on the whole dataset, please download the KITTI-Odometry dataset from the original website and the pretrained model (see the Collection of downloads), and change the corresponding paths:

  $ cd mos_RangeNet/tasks/semantic
  $ python3 infer.py -d path/to/kitti/dataset -m path/to/pretrained_model -l path/to/log -s train/valid/test # depending on the desired split to evaluate

Training

To train a LiDAR-MOS network with RangeNet++ from scratch, one has to download the KITTI-Odometry dataset and the SemanticKITTI dataset, change the corresponding paths, and run:

  $ cd mos_RangeNet/tasks/semantic
  $ python3 train.py -d path/to/kitti/dataset -ac rangenet_mos.yaml -l path/to/log

More pretrained models and LiDAR-MOS predictions can be found in the Collection of downloads.

Evaluation and visualization

How to evaluate

Evaluation metrics: we denote the moving (dynamic) status as D and the static status as S.

Since we ignore the unlabelled and invalid status, there are only two classes in MOS.

GT\Prediction	dynamic	static
dynamic	TD	FS
static	FD	TS

$$ IoU_{MOS} = \frac{TD}{TD+FD+FS} $$
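The metric above can be sketched in a few lines of NumPy. This is only an illustration of the formula, not the evaluation script: the function name `mos_iou` is hypothetical, and the inputs are assumed to be boolean "moving" masks from which unlabelled and invalid points have already been removed.

```python
import numpy as np


def mos_iou(gt_moving, pred_moving):
    """IoU of the moving class from two boolean masks of equal length."""
    gt = np.asarray(gt_moving, dtype=bool)
    pred = np.asarray(pred_moving, dtype=bool)

    td = np.count_nonzero(gt & pred)    # moving, predicted moving
    fd = np.count_nonzero(~gt & pred)   # static, predicted moving
    fs = np.count_nonzero(gt & ~pred)   # moving, predicted static

    denom = td + fd + fs
    # The IoU is undefined if no point is moving in either mask.
    return td / denom if denom > 0 else float("nan")
```

Note that TS (static points correctly predicted as static) does not enter the formula, so the score is not inflated by the large static majority of a scene.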

To evaluate the MOS results on the toy dataset just run:

  $ python3 utils/evaluate_mos.py -d data -p data/predictions_salsanext_residual_1_valid -s valid

To evaluate the MOS results on our LiDAR-MOS benchmark please have a look at our semantic-kitti-api and benchmark website.

How to visualize the predictions

To visualize the MOS results on the toy dataset just run:

  $ python3 utils/visualize_mos.py -d data -p data/predictions_salsanext_residual_1_valid -s 8  # here we use a specific sequence number

where:

-s (sequence) is the sequence to be accessed.
-d (dataset) is the path to the KITTI dataset where the sequences directory is.
-p is the path to the predictions.

Navigation:

n is next scan,
b is previous scan,
esc or q exits.

Applications

LiDAR-MOS is very important for building consistent maps, making future state predictions, avoiding collisions, and planning. It can also improve and robustify pose estimation, sensor data registration, and SLAM. Here we show two obvious applications of our LiDAR-MOS which are LiDAR-based odometry/SLAM as well as 3D mapping. Before that, we show two simple examples of how to combine our method with semantics and clean the scans. After cleaning scans we can get better odometry/SLAM and 3D mapping results.

Note that, here we show two direct use cases of our MOS approach without any further optimizations employed.

Enhanced with semantics

To show a simple way of combining our LiDAR-MOS with semantics, we provide a quick demo with the toy dataset:

  $ python3 utils/combine_semantics.py

It simply checks whether the points predicted as moving belong to movable semantic classes. If not, they are re-assigned as static.
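The check described above can be sketched as follows. This is a hedged illustration rather than the code in utils/combine_semantics.py: the function name, the label IDs for moving (251) and static (9), and the list of movable semantic class IDs are assumptions based on the SemanticKITTI labelling convention and may need to be adapted.

```python
import numpy as np

# Assumed SemanticKITTI IDs of classes that are able to move
# (vehicles, persons, riders); adapt to your label configuration.
MOVABLE_CLASSES = (10, 11, 13, 15, 16, 18, 20, 30, 31, 32)


def combine_with_semantics(mos_labels, sem_labels,
                           moving_id=251, static_id=9,
                           movable_classes=MOVABLE_CLASSES):
    """Re-assign 'moving' predictions on non-movable classes to static."""
    mos = np.asarray(mos_labels).copy()
    sem = np.asarray(sem_labels)

    is_moving = mos == moving_id
    is_movable = np.isin(sem, movable_classes)

    # A point on, e.g., road or building cannot be moving.
    mos[is_moving & ~is_movable] = static_id
    return mos
```

The filter can only remove false positives: points predicted as static are never changed to moving.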

Clean the scans

To clean the LiDAR scans with our LiDAR-MOS as masks, we also provide a quick demo on the toy dataset:

  $ python3 utils/scan_cleaner.py
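Conceptually, cleaning a scan means dropping every point whose MOS label is "moving". The sketch below assumes SemanticKITTI-style labels (one unsigned 32-bit value per point, with the class ID in the lower 16 bits) and a moving class ID of 251; the function name `clean_scan` is hypothetical and the script above remains the reference.

```python
import numpy as np


def clean_scan(points, labels, moving_id=251):
    """Return only the points that are not labelled as moving.

    points: (N, D) array, e.g. x, y, z, remission.
    labels: (N,) integer array in SemanticKITTI format (assumed).
    """
    points = np.asarray(points)
    # Keep only the class ID stored in the lower 16 bits.
    class_ids = np.asarray(labels).astype(np.uint32) & 0xFFFF
    keep = class_ids != moving_id
    return points[keep]
```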

Odometry/SLAM

Using the cleaned LiDAR scans, i.e., simply applying our MOS predictions as a preprocessing mask, improves the odometry results on both the KITTI training and test data. The results are even slightly better than those of the carefully designed SuMa++, which is enhanced with the full set of semantic classes.

The test results of our methods can also be found on the KITTI-Odometry benchmark.

Mapping

We compare the aggregated point cloud maps built (left) directly from the raw LiDAR scans and (right) from the LiDAR scans cleaned by applying our MOS predictions as masks. As can be seen, moving objects pollute the map, which might have adverse effects when the map is used for localization or path planning. By using our MOS predictions as masks, we can effectively remove these artifacts and get a clean map.

Collection of downloads

LiDAR_MOS_toy_dataset (toy dataset used for the quick demos)
predictions_salsanext_semantic (semantic segmentation results from SalsaNext for all sequences 00 - 21)
predictions_salsanext_residual_8_sem (Our best! LiDAR-MOS results using SalsaNext with 8 residual images + semantics)
model_rangenet_residual_1 (pretrained model using RangeNet++ with 1 residual image)
model_salsanext_residual_1 (pretrained model using SalsaNext with 1 residual image)
model_salsanext_residual_8 (pretrained model using SalsaNext with 8 residual images)

License

This project is free software made available under the MIT License. For details see the LICENSE file.

Comments

How to use SalsaNet with my own dataset?

Hi, I have read the paper and built and run your LiDAR-MOS.
Thanks for sharing your awesome project here.

I have a question.
How can I use the code with my own dataset?
I'll use the pretrained model you uploaded, so I think all I need to do is convert my data into the appropriate format for LiDAR-MOS.
The data I have consists of .bag and .pcd files.

I'd appreciate it if you could give me some advice.

Best regards.

opened by bigbigpark 15
Questions about LIDAR-MOS visualization

Hello, your LiDAR-MOS code is very good, but I have a problem: the results cannot be visualized when I reproduce your code. After running the command, the program seems to be stuck, and I don't know why. I would like to get your visualized results.

PS: Author reply: From the results of the operation, it seems to be a tkinter problem. But the tkinter module does not seem to be missing. If anyone knows how to solve it, I hope you can help me. Thanks.

opened by MrNeoJeep 13
TRAIN BUGS

Thanks for your quick response! @Chen-Xieyuanli

I trained with salsa_mos on SemanticKITTI. When it ran to here:

  Lr: 3.977e-03 | Update: 3.258e-04 mean,5.209e-04 std | Epoch: [0][950/2391] | Time 0.641 (0.623) | Data 0.081 (0.067) | Loss 0.6863 (0.9777) | acc 0.830 (0.855) | IoU 0.417 (0.434) | [1 day, 3:19:41]
  LiDAR-MOS/mos_SalsaNext/train/tasks/semantic/../../common/laserscan.py:166: RuntimeWarning: invalid value encountered in true_divide
    pitch = np.arcsin(scan_z / depth)

I got the error:

  File "conda/envs/salsa/lib/python3.9/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
  IndexError: Caught IndexError in DataLoader worker process 1.

and

  File "LiDAR-MOS/mos_SalsaNext/train/tasks/semantic/../../common/laserscan.py", line 201, in do_range_projection
    self.proj_range[proj_y, proj_x] = depth
  IndexError: index -2147483648 is out of bounds for axis 0 with size 64

The conda env is:

  python=3.9.12=h12debd9_0
  python-flatbuffers=1.12=pyhd3eb1b0_0
  pytorch=1.11.0=py3.9_cuda11.3_cudnn8.2.0_0
  pytorch-mutex=1.0=cuda
  tensorflow-base=2.6.0=mkl_py39h3d85931_0
  tensorflow-estimator=2.6.0=pyh7b7c402_0

Is there anything wrong with this env? Is it too new? Or is there something wrong with the data in the residual images?

opened by LiXiang0021 10
Multiple input frames

Thank you for your great code!

I notice in rangenet_mos.yaml and salsanext_mos.yaml you only use one input frame. If I want to use multiple scans for training, how can I do it?

As far as I know, I need to change n_input_scans in both backbone and dataset, and transform in the dataset. What else do I need?

Another question is about the pose transform. When using a sequence of scans to train the model, why do you transform the pose to the last scan but not the first scan? here

opened by Trainingzy 9
Migrating the model to Livox Horizon

Thanks for your excellent work. I downloaded the toy dataset you provide and ran inference on it, and the performance is good. Now I want to use the model with my Livox Horizon, which has an 80x25 FoV and a point density similar to a 64-line LiDAR. When I use your tools to generate the range images and residual images, I only change pose.txt and calib.txt to my own, and everything works: I can generate correct range images and residual images. But when I try to run inference on my data with the model, I get an error in the range image generation function in SalsaNext. Are there any possible reasons?

opened by Psyclonus2887 8
dims of multi_residuals_images, thanks!

Dear author,

If n_input_scans = 2, is the number of channels of proj_full 12 (x, y, z, r, e, x, y, z, r, e, residual1, residual2)?

Is that right?

I'm sorry to bother you. This really confuses me.

Thanks.

opened by emilyemliyM 8
Question about loading the pretrained salsanext model

Hi!

Thanks so much for the codes! I've a question about loading the pretrained salsanext model. When I followed the steps outlined in the "How to use" section and tested on the toy example, I came across the issue when trying to run infer.py on the toy dataset (python3 infer.py -d ../../../../data -m ../../../../data/model_salsanext_residual_1 -l ../../../../data/predictions_salsanext_residual_1_new -s valid) and got this error:

RuntimeError: ../../../../data/model_salsanext_residual_1/SalsaNext_valid_best is a zip archive (did you mean to use torch.jit.load()?)

I tried to switch from torch.load() to torch.jit.load() in the user.py as it suggested but it leads to other errors. What did I do wrong, or did I miss something along the way? I set up the environment according to the instruction linked on the github (using Pytorch 1.1).

Thank you in advance for your help!

opened by maneekwant 8

How can I train 'SalsaNext' successfully? (A problem occurred when training 'SalsaNext')

Hi, thanks for sharing your great code. I'm trying to run the whole process of your work, but I can't train SalsaNext.

I tried :

./train.sh -d ../../../../dataset/KITTI_dataset/velodyne_laser/dataset/ -a salsanext_mos.yml -l logs/ -c 0

learning process arrived at :

Lr: 5.944e-03 | Update: 2.381e-04 mean,3.611e-04 std | Epoch: [0][11370/19130] | Time 0.203 (0.204) | Data 0.030 (0.031) | Loss 0.3839 (0.2800) | acc 0.962 (0.980) | IoU 0.685 (0.517) | [7 days, 18:10:40]

and the error msgs i got :

proj_full = torch.cat([proj_full, torch.unsqueeze(eval("proj_residuals_" + str(i+1)), 0)])
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 2048 and 2058 in dimension 2 at ../aten/src/TH/generic/THTensor.cpp:711

Is the conda environment setting wrong? currently using :

tensorboard               1.13.1                   pypi_0    pypi
tensorboard-data-server   0.6.1                    pypi_0    pypi
tensorboard-plugin-wit    1.8.0                    pypi_0    pypi
tensorflow                1.13.1                   pypi_0    pypi
tensorflow-estimator      1.13.0                   pypi_0    pypi

Also, could you check the links in the Collection of downloads again? They seem to be inaccessible now. Thank you!

bug

opened by Sunghooon 7

prediction labels in toy dataset

Hi,

I see there are some segmentation results already present in the toy dataset.

The one that says salsanext, does it use one residual image?

Best Regards Sambit

opened by SM1991CODES 6
How to use the pretrained model to test my own LiDAR scans?
Thanks for your work! Now I want to clean my own LiDAR scans.

Use infer.py to get the MOS prediction labels.

Then use utils/scan_cleaner.py to clean the scans.

Is that right? If not, can you give me some advice? Thanks a lot!
opened by Cxz-dev 5
how to change the "n_input_scans"?

I only changed "arch_cfg.yaml" in the model folder, but there is an error: size mismatch for module.downCntx.conv1.weight: copying a param with shape torch.Size([32, 6, 1, 1]) from checkpoint, the shape in current model is torch.Size([32, 13, 1, 1]). So how do I change "n_input_scans"? Thank you :)
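The shapes in this error message (6 input channels in the checkpoint, 13 in the current model) are consistent with five base channels plus one channel per residual image, so a checkpoint trained with 1 residual image cannot be loaded into a model configured for 8. The sketch below illustrates this channel arithmetic under that assumption; the function name `stack_network_input` and the channel order are hypothetical and not taken from the repository.

```python
import numpy as np


def stack_network_input(proj_range, proj_xyz, proj_remission, residuals):
    """Stack range, xyz, remission and residual images channel-wise.

    proj_range:     (H, W)
    proj_xyz:       (3, H, W)
    proj_remission: (H, W)
    residuals:      list of (H, W) residual images (may be empty)
    Returns an array of shape (5 + len(residuals), H, W).
    """
    base = np.concatenate(
        [proj_range[None], proj_xyz, proj_remission[None]], axis=0
    )
    if len(residuals) == 0:
        return base
    return np.concatenate([base, np.stack(residuals, axis=0)], axis=0)
```

Under the same assumption, two residual images would give 7 input channels.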

opened by pikaqiu0131 5
About set my own Lidar pose to kitti format pose

Hi @Chen-Xieyuanli! Thanks for your code! Now I have another question. I know the code needs poses as input (e.g., in gen_residual_images.py), but when I use my own LiDAR without a camera, how can I get the pose.txt from LiDAR odometry? I think the poses need to be expressed in the LiDAR coordinate frame, so can I just use the pose (1*3) in LiDAR coordinates that comes from LiDAR odometry as input?

opened by Cxz-dev 3
How to re-train when training is unexpectedly interrupted?

Hi, thank you for your wonderful open-source project. I would like to know whether training can be resumed from where it stopped when it is unexpectedly interrupted. If so, how?

opened by beyounged 8

RuntimeError("grad can be implicitly created only for scalar outputs")

Hello, thank you for your great work. I am training my own dataset and encountered the following error.

Ignoring class  0  in IoU evaluation
[IOU EVAL] IGNORE:  tensor([0])
[IOU EVAL] INCLUDE:  tensor([1, 2])
Lr: 3.106e-05 | Update: 2.258e-01 mean,4.181e-01 std | Epoch: [0][0/322] | Time 3.170 (3.170) | Data 0.154 (0.154) | Loss 1.9250 (1.9250) | acc 0.533 (0.533) | IoU 0.363 (0.363) | [1 day, 20:35:54]
Traceback (most recent call last):
  File "/content/LiDAR-MOS/mos_SalsaNext/train/tasks/semantic/train.py", line 178, in <module>
    trainer.train()
  File "../../tasks/semantic/modules/trainer.py", line 274, in train
    show_scans=self.ARCH["train"]["show_scans"])
  File "../../tasks/semantic/modules/trainer.py", line 391, in train_epoch
    loss_m.backward()
  File "/usr/local/lib/python3.7/dist-packages/torch/_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 166, in backward
    grad_tensors_ = _make_grads(tensors, grad_tensors_, is_grads_batched=False)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/__init__.py", line 67, in _make_grads
    raise RuntimeError("grad can be implicitly created only for scalar outputs")
RuntimeError: grad can be implicitly created only for scalar outputs

  optimizer.zero_grad()
  if self.n_gpus > 1:
      idx = torch.ones(self.n_gpus).cuda()
      loss_m.backward(idx)
  else:
      loss_m.backward()  # here I got the error
  optimizer.step()

I have looked up the error on Google, and it usually happens when two or more GPUs are used. However, I am using only one GPU and still got this error. Could you please help me solve it?

opened by e1339g 1

Training setups (tested with different GPUs)

Dear author,

Thanks for sharing the code.

I'm trying to reproduce the metrics from the paper, but haven't been successful yet. Could you share the training parameters and hardware used for the experiments? Regarding metrics such as the IoU in the paper, do you mean the mIoU or just the IoU of the moving class?

Thanks!
good first issue

opened by emilyemliyM 6
Tweaking the model for partial azimuth FOV Lidar

Hi, my LiDAR's azimuth FOV is only ~100 [deg]. What would be the best way to tweak the model or the configuration so that it works? Currently the range images (and also the residual images) are very sparse on the right and left sides, and I think that is one of the reasons for the bad performance I get. Thanks

opened by boazMgm 7

Releases(v1.1)

v1.1(Sep 6, 2021)

Thanks Jiadai Sun for testing and correcting some bugs of SalsaNext-MOS
v1.0(Sep 6, 2021)

The first open-source version

Owner

Photogrammetry & Robotics Bonn

Photogrammetry & Robotics Lab at the University of Bonn

