Nvidia Semantic Segmentation monorepo

Overview

Paper | YouTube | Cityscapes Score

Pytorch implementation of our paper Hierarchical Multi-Scale Attention for Semantic Segmentation.

Please refer to the sdcnet branch if you are looking for the code corresponding to Improving Semantic Segmentation via Video Prediction and Label Relaxation.

Installation

  • The code is tested with pytorch 1.3 and python 3.6
  • You can use ./Dockerfile to build an image.

Download Weights

  • Create a directory where you can keep large files. Ideally, not in this directory.
  > mkdir <large_asset_dir>
  • Update __C.ASSETS_PATH in config.py to point at that directory

    __C.ASSETS_PATH=<large_asset_dir>

  • Download pretrained weights from google drive and put into <large_asset_dir>/seg_weights

Download/Prepare Data

If using Cityscapes, download Cityscapes data, then update config.py to set the path:

__C.DATASET.CITYSCAPES_DIR=<path_to_cityscapes>

If using Cityscapes Autolabelled Images, download Cityscapes data, then update config.py to set the path:

__C.DATASET.CITYSCAPES_CUSTOMCOARSE=<path_to_cityscapes>

If using Mapillary, download Mapillary data, then update config.py to set the path:

__C.DATASET.MAPILLARY_DIR=<path_to_mapillary>

Running the code

The instructions below make use of a tool called runx, which we find useful to help automate experiment running and summarization. For more information about this tool, please see runx. In general, you can either use the runx-style commandlines shown below. Or you can call python train.py <args ...> directly if you like.

Run inference on Cityscapes

Dry run:

> python -m runx.runx scripts/eval_cityscapes.yml -i -n

This will just print out the command but not run. It's a good way to inspect the commandline.

Real run:

> python -m runx.runx scripts/eval_cityscapes.yml -i

The reported IOU should be 86.92. This evaluates with scales of 0.5, 1.0. and 2.0. You will find evaluation results in ./logs/eval_cityscapes/...

Run inference on Mapillary

> python -m runx.runx scripts/eval_mapillary.yml -i

The reported IOU should be 61.05. Note that this must be run on a 32GB node and the use of 'O3' mode for amp is critical in order to avoid GPU out of memory. Results in logs/eval_mapillary/...

Dump images for Cityscapes

> python -m runx.runx scripts/dump_cityscapes.yml -i

This will dump network output and composited images from running evaluation with the Cityscapes validation set.

Run inference and dump images on a folder of images

> python -m runx.runx scripts/dump_folder.yml -i

You should end up seeing images that look like the following:

alt text

Train a model

Train cityscapes, using HRNet + OCR + multi-scale attention with fine data and mapillary-pretrained model

> python -m runx.runx scripts/train_cityscapes.yml -i

The first time this command is run, a centroid file has to be built for the dataset. It'll take about 10 minutes. The centroid file is used during training to know how to sample from the dataset in a class-uniform way.

This training run should deliver a model that achieves 84.7 IOU.

Train SOTA default train-val split

> python -m runx.runx  scripts/train_cityscapes_sota.yml -i

Again, use -n to do a dry run and just print out the command. This should result in a model with 86.8 IOU. If you run out of memory, try to lower the crop size or turn off rmi_loss.

Owner
NVIDIA Corporation
NVIDIA Corporation
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Semantic Segmentation.

Swin Transformer for Semantic Segmentation of satellite images This repo contains the supported code and configuration files to reproduce semantic seg

23 Oct 10, 2022
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
Invert and perturb GAN images for test-time ensembling

GAN Ensembling Project Page | Paper | Bibtex Ensembling with Deep Generative Views. Lucy Chai, Jun-Yan Zhu, Eli Shechtman, Phillip Isola, Richard Zhan

Lucy Chai 93 Dec 08, 2022
"3D Human Texture Estimation from a Single Image with Transformers", ICCV 2021

Texformer: 3D Human Texture Estimation from a Single Image with Transformers This is the official implementation of "3D Human Texture Estimation from

XiangyuXu 193 Dec 05, 2022
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply

Overview | Tutorials | Examples | Installation | FAQ | How to Cite Welcome to ktrain News and Announcements 2020-11-08: ktrain v0.25.x is released and

Arun S. Maiya 1.1k Jan 02, 2023
Yolov5-opencv-cpp-python - Example of using ultralytics YOLO V5 with OpenCV 4.5.4, C++ and Python

yolov5-opencv-cpp-python Example of performing inference with ultralytics YOLO V

183 Jan 09, 2023
State-of-the-art data augmentation search algorithms in PyTorch

MuarAugment Description MuarAugment is a package providing the easiest way to a state-of-the-art data augmentation pipeline. How to use You can instal

43 Dec 12, 2022
Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

M-LSD: Towards Light-weight and Real-time Line Segment Detection Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Det

123 Jan 04, 2023
Code and project page for ICCV 2021 paper "DisUnknown: Distilling Unknown Factors for Disentanglement Learning"

DisUnknown: Distilling Unknown Factors for Disentanglement Learning See introduction on our project page Requirements PyTorch = 1.8.0 torch.linalg.ei

Sitao Xiang 24 May 16, 2022
A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering.

DeepFilterNet A Low Complexity Speech Enhancement Framework for Full-Band Audio (48kHz) based on Deep Filtering. libDF contains Rust code used for dat

Hendrik Schröter 292 Dec 25, 2022
Multi-Glimpse Network With Python

Multi-Glimpse Network Our code requires Python ≥ 3.8 Installation For example, venv + pip: $ python3 -m venv env $ source env/bin/activate (env) $ pyt

9 May 10, 2022
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

Apache MXNet (incubating) for Deep Learning Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to m

The Apache Software Foundation 20.2k Jan 05, 2023
"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

NAS-Bench-301 This repository containts code for the paper: "NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search". The

AutoML-Freiburg-Hannover 57 Nov 30, 2022
EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration

EDPN: Enhanced Deep Pyramid Network for Blurry Image Restoration Ruikang Xu, Zeyu Xiao, Jie Huang, Yueyi Zhang, Zhiwei Xiong. EDPN: Enhanced Deep Pyra

69 Dec 15, 2022
library for nonlinear optimization, wrapping many algorithms for global and local, constrained or unconstrained, optimization

NLopt is a library for nonlinear local and global optimization, for functions with and without gradient information. It is designed as a simple, unifi

Steven G. Johnson 1.4k Dec 25, 2022
Uni-Fold: Training your own deep protein-folding models.

Uni-Fold: Training your own deep protein-folding models. This package provides and implementation of a trainable, Transformer-based deep protein foldi

DeepModeling 88 Jan 03, 2023
[SIGIR22] Official PyTorch implementation for "CORE: Simple and Effective Session-based Recommendation within Consistent Representation Space".

CORE This is the official PyTorch implementation for the paper: Yupeng Hou, Binbin Hu, Zhiqiang Zhang, Wayne Xin Zhao. CORE: Simple and Effective Sess

RUCAIBox 26 Dec 19, 2022
Official implementation of deep-multi-trajectory-based single object tracking (IEEE T-CSVT 2021).

DeepMTA_PyTorch Officical PyTorch Implementation of "Dynamic Attention-guided Multi-TrajectoryAnalysis for Single Object Tracking", Xiao Wang, Zhe Che

Xiao Wang(王逍) 7 Dec 03, 2022
African language Speech Recognition - Speech-to-Text

Swahili-Speech-To-Text Table of Contents Swahili-Speech-To-Text Overview Scenario Approach Project Structure data: models: notebooks: scripts tests: l

2 Jan 05, 2023
Simulator for FRC 2022 challenge: Rapid React

rrsim Simulator for FRC 2022 challenge: Rapid React out-1.mp4 Usage In order to run the simulator use the following: python3 rrsim.py [config_path] wh

1 Jan 18, 2022