A Kitti Road Segmentation model implemented in tensorflow.

Overview

KittiSeg

KittiSeg performs segmentation of roads by utilizing an FCN based model. The model achieved first place on the Kitti Road Detection Benchmark at submission time. Check out our paper for a detailed model description.

The model is designed to perform well on small datasets. The training is done using just 250 densely labelled images. Despite this a state-of-the art MaxF1 score of over 96% is achieved. The model is usable for real-time application. Inference can be performed at the impressive speed of 95ms per image.

The repository contains code for training, evaluating and visualizing semantic segmentation in TensorFlow. It is build to be compatible with the TensorVision back end which allows to organize experiments in a very clean way. Also check out KittiBox a similar projects to perform state-of-the art detection. And finally the MultiNet repository contains code to jointly train segmentation, classification and detection. KittiSeg and KittiBox are utilized as submodules in MultiNet.

Requirements

The code requires Tensorflow 1.0, python 2.7 as well as the following python libraries:

  • matplotlib
  • numpy
  • Pillow
  • scipy
  • commentjson

Those modules can be installed using: pip install numpy scipy pillow matplotlib commentjson or pip install -r requirements.txt.

Setup

  1. Clone this repository: git clone https://github.com/MarvinTeichmann/KittiSeg.git
  2. Initialize all submodules: git submodule update --init --recursive
  3. [Optional] Download Kitti Road Data:
    1. Retrieve kitti data url here: http://www.cvlibs.net/download.php?file=data_road.zip
    2. Call python download_data.py --kitti_url URL_YOU_RETRIEVED

Running the model using demo.py does not require you to download kitti data (step 3). Step 3 is only required if you want to train your own model using train.py or bench a model agains the official evaluation score evaluate.py. Also note, that I recommend using download_data.py instead of downloading the data yourself. The script will also extract and prepare the data. See Section Manage data storage if you like to control where the data is stored.

To update an existing installation do:
  1. Pull all patches: git pull
  2. Update all submodules: git submodule update --init --recursive

If you forget the second step you might end up with an inconstant repository state. You will already have the new code for KittiSeg but run it old submodule versions code. This can work, but I do not run any tests to verify this.

Tutorial

Getting started

Run: python demo.py --input_image data/demo/demo.png to obtain a prediction using demo.png as input.

Run: python evaluate.py to evaluate a trained model.

Run: python train.py --hypes hypes/KittiSeg.json to train a model using Kitti Data.

If you like to understand the code, I would recommend looking at demo.py first. I have documented each step as thoroughly as possible in this file.

Manage Data Storage

KittiSeg allows to separate data storage from code. This is very useful in many server environments. By default, the data is stored in the folder KittiSeg/DATA and the output of runs in KittiSeg/RUNS. This behaviour can be changed by setting the bash environment variables: $TV_DIR_DATA and $TV_DIR_RUNS.

Include export TV_DIR_DATA="/MY/LARGE/HDD/DATA" in your .profile and the all data will be downloaded to /MY/LARGE/HDD/DATA/data_road. Include export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS" in your .profile and all runs will be saved to /MY/LARGE/HDD/RUNS/KittiSeg

RUNDIR and Experiment Organization

KittiSeg helps you to organize large number of experiments. To do so the output of each run is stored in its own rundir. Each rundir contains:

  • output.log a copy of the training output which was printed to your screen
  • tensorflow events tensorboard can be run in rundir
  • tensorflow checkpoints the trained model can be loaded from rundir
  • [dir] images a folder containing example output images. image_iter controls how often the whole validation set is dumped
  • [dir] model_files A copy of all source code need to build the model. This can be very useful of you have many versions of the model.

To keep track of all the experiments, you can give each rundir a unique name with the --name flag. The --project flag will store the run in a separate subfolder allowing to run different series of experiments. As an example, python train.py --project batch_size_bench --name size_5 will use the following dir as rundir: $TV_DIR_RUNS/KittiSeg/batch_size_bench/size_5_KittiSeg_2017_02_08_13.12.

The flag --nosave is very useful to not spam your rundir.

Modifying Model & Train on your own data

The model is controlled by the file hypes/KittiSeg.json. Modifying this file should be enough to train the model on your own data and adjust the architecture according to your needs. A description of the expected input format can be found here.

For advanced modifications, the code is controlled by 5 different modules, which are specified in hypes/KittiSeg.json.

"model": {
   "input_file": "../inputs/kitti_seg_input.py",
   "architecture_file" : "../encoder/fcn8_vgg.py",
   "objective_file" : "../decoder/kitti_multiloss.py",
   "optimizer_file" : "../optimizer/generic_optimizer.py",
   "evaluator_file" : "../evals/kitti_eval.py"
},

Those modules operate independently. This allows easy experiments with different datasets (input_file), encoder networks (architecture_file), etc. Also see TensorVision for a specification of each of those files.

Utilize TensorVision backend

KittiSeg is build on top of the TensorVision TensorVision backend. TensorVision modularizes computer vision training and helps organizing experiments.

To utilize the entire TensorVision functionality install it using

$ cd KittiSeg/submodules/TensorVision
$ python setup.py install

Now you can use the TensorVision command line tools, which includes:

tv-train --hypes hypes/KittiSeg.json trains a json model.
tv-continue --logdir PATH/TO/RUNDIR trains the model in RUNDIR, starting from the last saved checkpoint. Can be used for fine tuning by increasing max_steps in model_files/hypes.json .
tv-analyze --logdir PATH/TO/RUNDIR evaluates the model in RUNDIR

Useful Flags & Variabels

Here are some Flags which will be useful when working with KittiSeg and TensorVision. All flags are available across all scripts.

--hypes : specify which hype-file to use
--logdir : specify which logdir to use
--gpus : specify on which GPUs to run the code
--name : assign a name to the run
--project : assign a project to the run
--nosave : debug run, logdir will be set to debug

In addition the following TensorVision environment Variables will be useful:

$TV_DIR_DATA: specify meta directory for data
$TV_DIR_RUNS: specify meta directory for output
$TV_USE_GPUS: specify default GPU behaviour.

On a cluster it is useful to set $TV_USE_GPUS=force. This will make the flag --gpus mandatory and ensure, that run will be executed on the right GPU.

Questions?

Please have a look into the FAQ. Also feel free to open an issue to discuss any questions not covered so far.

Citation

If you benefit from this code, please cite our paper:

@article{teichmann2016multinet,
  title={MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving},
  author={Teichmann, Marvin and Weber, Michael and Zoellner, Marius and Cipolla, Roberto and Urtasun, Raquel},
  journal={arXiv preprint arXiv:1612.07695},
  year={2016}
}
Owner
Marvin Teichmann
Germany Phd student. Working on Deep Learning and Computer Vision projects.
Marvin Teichmann
GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

GalaXC GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification @InProceedings{Saini21, author = {Saini, D. and Jain,

Extreme Classification 28 Dec 05, 2022
MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research

MOOSE (Multi-organ objective segmentation) a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research.The pipeline is based on nn-UNet an

QIMP team 30 Jan 01, 2023
A proof of concept ai-powered Recaptcha v2 solver

Recaptcha Fullauto I've decided to open source my old Recaptcha v2 solver. My latest version will be opened sourced this summer. I am hoping this proj

Nate 60 Dec 20, 2022
EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration

EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration Ruikang Xu, Zeyu Xiao, Jie Huang, Yueyi Zhang, Zhiwei Xiong. EDPN: Enhanced Deep Pyra

69 Dec 15, 2022
Source code of our BMVC 2021 paper: AniFormer: Data-driven 3D Animation with Transformer

AniFormer This is the PyTorch implementation of our BMVC 2021 paper AniFormer: Data-driven 3D Animation with Transformer. Haoyu Chen, Hao Tang, Nicu S

24 Nov 02, 2022
Reproduced Code for Image Forgery Detection papers.

Image Forgery Detection With over 4.5 billion active internet users, the amount of multimedia content being shared every day has surpassed everyone’s

Umar Masud 15 Dec 06, 2022
Torchserve server using a YoloV5 model running on docker with GPU and static batch inference to perform production ready inference.

Yolov5 running on TorchServe (GPU compatible) ! This is a dockerfile to run TorchServe for Yolo v5 object detection model. (TorchServe (PyTorch librar

82 Nov 29, 2022
ShapeGlot: Learning Language for Shape Differentiation

ShapeGlot: Learning Language for Shape Differentiation Created by Panos Achlioptas, Judy Fan, Robert X.D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

Panos 32 Dec 23, 2022
RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds

RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds This repository contains the code asscoiated

Felix Hensel 14 Dec 12, 2022
以孤立语假设和宽度优先搜索为基础,构建了一种多通道堆叠注意力Transformer结构的斗地主ai

ddz-ai 介绍 斗地主是一种扑克游戏。游戏最少由3个玩家进行,用一副54张牌(连鬼牌),其中一方为地主,其余两家为另一方,双方对战,先出完牌的一方获胜。 ddz-ai以孤立语假设和宽度优先搜索为基础,构建了一种多通道堆叠注意力Transformer结构的系统,使其经过大量训练后,能在实际游戏中获

freefuiiismyname 88 May 15, 2022
Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images (ICCV 2021)

Table of Content Introduction Getting Started Datasets Installation Experiments Training & Testing Pretrained models Texture fine-tuning Demo Toward R

VinAI Research 42 Dec 05, 2022
Exemplo de implementação do padrão circuit breaker em python

fast-circuit-breaker Circuit breakers existem para permitir que uma parte do seu sistema falhe sem destruir todo seu ecossistema de serviços. Michael

James G Silva 17 Nov 10, 2022
Repo for CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning

CReST in Tensorflow 2 Code for the paper: "CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning" by Chen Wei, Ki

Google Research 75 Nov 01, 2022
Multi-Scale Progressive Fusion Network for Single Image Deraining

Multi-Scale Progressive Fusion Network for Single Image Deraining (MSPFN) This is an implementation of the MSPFN model proposed in the paper (Multi-Sc

Kuijiang 128 Nov 21, 2022
Confident Semantic Ranking Loss for Part Parsing

Confident Semantic Ranking Loss for Part Parsing

Jiachen Xu 5 Oct 22, 2022
Cross-platform CLI tool to generate your Github profile's stats and summary.

ghs Cross-platform CLI tool to generate your Github profile's stats and summary. Preview Hop on to examples for other usecases. Jump to: Installation

HackerRank 134 Dec 20, 2022
Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs

Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs This repository contains code to accompany the paper "Hierarchical Clustering: O

3 Sep 25, 2022
This repository provides the code for MedViLL(Medical Vision Language Learner).

MedViLL This repository provides the code for MedViLL(Medical Vision Language Learner). Our proposed architecture MedViLL is a single BERT-based model

SuperSuperMoon 39 Jan 05, 2023
Illuminated3D This project participates in the Nasa Space Apps Challenge 2021.

Illuminated3D This project participates in the Nasa Space Apps Challenge 2021.

Eleftheriadis Emmanouil 1 Oct 09, 2021
A framework for joint super-resolution and image synthesis, without requiring real training data

SynthSR This repository contains code to train a Convolutional Neural Network (CNN) for Super-resolution (SR), or joint SR and data synthesis. The met

83 Jan 01, 2023