Human segmentation models, training/inference code, and trained weights, implemented in PyTorch

Overview

Human-Segmentation-PyTorch

Human segmentation models, training/inference code, and trained weights, implemented in PyTorch.

Supported networks

To assess architecture, memory, forward time (in either cpu or gpu), numper of parameters, and number of FLOPs of a network, use this command:

python measure_model.py

Dataset

Portrait Segmentation (Human/Background)

Set

  • Python3.6.x is used in this repository.
  • Clone the repository:
git clone --recursive https://github.com/AntiAegis/Human-Segmentation-PyTorch.git
cd Human-Segmentation-PyTorch
git submodule sync
git submodule update --init --recursive
  • To install required packages, use pip:
workon humanseg
pip install -r requirements.txt
pip install -e models/pytorch-image-models

Training

  • For training a network from scratch, for example DeepLab3+, use this command:
python train.py --config config/config_DeepLab.json --device 0

where config/config_DeepLab.json is the configuration file which contains network, dataloader, optimizer, losses, metrics, and visualization configurations.

  • For resuming training the network from a checkpoint, use this command:
python train.py --config config/config_DeepLab.json --device 0 --resume path_to_checkpoint/model_best.pth
  • One can open tensorboard to monitor the training progress by enabling the visualization mode in the configuration file.

Inference

There are two modes of inference: video and webcam.

python inference_video.py --watch --use_cuda --checkpoint path_to_checkpoint/model_best.pth
python inference_webcam.py --use_cuda --checkpoint path_to_checkpoint/model_best.pth

Benchmark

  • Networks are trained on a combined dataset from the two mentioned datasets above. There are 6627 training and 737 testing images.
  • Input size of model is set to 320.
  • The CPU and GPU time is the averaged inference time of 10 runs (there are also 10 warm-up runs before measuring) with batch size 1.
  • The mIoU is measured on the testing subset (737 images) from the combined dataset.
  • Hardware configuration for benchmarking:
CPU: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
GPU: GeForce GTX 1050 Mobile, CUDA 9.0
Model Parameters FLOPs CPU time GPU time mIoU
UNet_MobileNetV2 (alpha=1.0, expansion=6) 4.7M 1.3G 167ms 17ms 91.37%
UNet_ResNet18 16.6M 9.1G 165ms 21ms 90.09%
DeepLab3+_ResNet18 16.6M 9.1G 133ms 28ms 91.21%
BiSeNet_ResNet18 11.9M 4.7G 88ms 10ms 87.02%
PSPNet_ResNet18 12.6M 20.7G 235ms 666ms ---
ICNet_ResNet18 11.6M 2.0G 48ms 55ms 86.27%
Owner
Thuy Ng
Machine Learning, Deep Learning, Computer Vision, Signal Processing
Thuy Ng
Code for paper: Towards Tokenized Human Dynamics Representation

Video Tokneization Codebase for video tokenization, based on our paper Towards Tokenized Human Dynamics Representation. Prerequisites (tested under Py

Kenneth Li 20 May 31, 2022
SeisComP/SeisBench interface to enable deep-learning (re)picking in SeisComP

scdlpicker SeisComP/SeisBench interface to enable deep-learning (re)picking in SeisComP Objective This is a simple deep learning (DL) repicker module

Joachim Saul 6 May 13, 2022
The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution.

WSRGlow The official implementation of the Interspeech 2021 paper WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution. Audio sa

Kexun Zhang 96 Jan 03, 2023
Monitor your ML jobs on mobile devices📱, especially for Google Colab / Kaggle

TF Watcher TF Watcher is a simple to use Python package and web app which allows you to monitor 👀 your Machine Learning training or testing process o

Rishit Dagli 54 Nov 01, 2022
Equivariant Imaging: Learning Beyond the Range Space

[Project] Equivariant Imaging: Learning Beyond the Range Space Project about the

Georges Le Bellier 3 Feb 06, 2022
Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021)

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme (NeurIPS2021) Overview Prerequisites Linux Pytho

Shaojie Li 34 Mar 31, 2022
This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection, built on SECOND.

3D-CVF This is the official implementation of 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object

YecheolKim 97 Dec 20, 2022
An open-source outlier detection package by Getcontact Data Team

pyfbad The pyfbad library supports anomaly detection projects. An end-to-end anomaly detection application can be written using the source codes of th

Teknasyon Tech 41 Dec 27, 2022
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

wsss-analysis The code of: A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, arXiv pre-print 2019 paper.

Lyndon Chan 48 Dec 18, 2022
WormMovementSimulation - 3D Simulation of Worm Body Movement with Neurons attached to its body

Generate 3D Locomotion Data This module is intended to create 2D video trajector

1 Aug 09, 2022
AI that generate music

PianoGPT ai that generate music try it here https://share.streamlit.io/annasajkh/pianogpt/main/main.py or here https://huggingface.co/spaces/Annas/Pia

Annas 28 Nov 27, 2022
BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

qinxin 260 Dec 06, 2022
YOLOX-RMPOLY

本算法为适应robomaster比赛,而改动自矩形识别的yolox算法。 基于旷视科技YOLOX,实现对不规则四边形的目标检测 TODO 修改onnx推理模型 更改/添加标注: 1.yolox/models/yolox_polyhead.py: 1.1继承yolox/models/yolo_

3 Feb 25, 2022
InterfaceGAN++: Exploring the limits of InterfaceGAN

InterfaceGAN++: Exploring the limits of InterfaceGAN Authors: Apavou Clément & Belkada Younes From left to right - Images generated using styleGAN and

Younes Belkada 42 Dec 23, 2022
A working implementation of the Categorical DQN (Distributional RL).

Categorical DQN. Implementation of the Categorical DQN as described in A distributional Perspective on Reinforcement Learning. Thanks to @tudor-berari

Florin Gogianu 98 Sep 20, 2022
Simple Dynamic Batching Inference

Simple Dynamic Batching Inference 解决了什么问题? 众所周知,Batch对于GPU上深度学习模型的运行效率影响很大。。。 是在Inference时。搜索、推荐等场景自带比较大的batch,问题不大。但更多场景面临的往往是稀碎的请求(比如图片服务里一次一张图)。 如果

116 Jan 01, 2023
Image augmentation library in Python for machine learning.

Augmentor is an image augmentation library in Python for machine learning. It aims to be a standalone library that is platform and framework independe

Marcus D. Bloice 4.8k Jan 07, 2023
Myia prototyping

Myia Myia is a new differentiable programming language. It aims to support large scale high performance computations (e.g. linear algebra) and their g

Mila 456 Nov 07, 2022
FedGS: A Federated Group Synchronization Framework Implemented by LEAF-MX.

FedGS: Data Heterogeneity-Robust Federated Learning via Group Client Selection in Industrial IoT Preparation For instructions on generating data, plea

Lizonghang 9 Dec 22, 2022
以孤立语假设和宽度优先搜索为基础,构建了一种多通道堆叠注意力Transformer结构的斗地主ai

ddz-ai 介绍 斗地主是一种扑克游戏。游戏最少由3个玩家进行,用一副54张牌(连鬼牌),其中一方为地主,其余两家为另一方,双方对战,先出完牌的一方获胜。 ddz-ai以孤立语假设和宽度优先搜索为基础,构建了一种多通道堆叠注意力Transformer结构的系统,使其经过大量训练后,能在实际游戏中获

freefuiiismyname 88 May 15, 2022