A simple PyTorch re-implementation of "YOLOX: Exceeding YOLO Series in 2021"

1. Notes

This is a simple PyTorch re-implementation of "YOLOX: Exceeding YOLO Series in 2021" [https://arxiv.org/abs/2107.08430].
The repo is still under development.

2. Environment

PyTorch >= 1.7.0, Python >= 3.6, Ubuntu or Windows; see 'requirements.txt' for the remaining dependencies.

cd /path/to/your/work
git clone https://github.com/zhangming8/yolox-pytorch.git
cd yolox-pytorch
Download the pre-trained weights from the Model Zoo into /path/to/your/work/weights

3. Object Detection

Model Zoo

All weights can be downloaded from GoogleDrive or BaiduDrive (code:bc72)

Model       test size  mAP val (0.5:0.95)  mAP test (0.5:0.95)  Params (M)
yolox-nano  416        25.4                25.7                 0.91
yolox-tiny  416        33.1                33.2                 5.06
yolox-s     640        39.3                39.6                 9.0
yolox-m     640        46.2                46.4                 25.3
yolox-l     640        49.5                50.0                 54.2
yolox-x     640        50.5                51.1                 99.1
yolox-x     800        51.2                51.9                 99.1

mAP was re-evaluated on COCO val2017 and test2017, and some results are slightly better than those of the official YOLOX implementation. You can reproduce them with the scripts in 'evaluate.sh'.
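The "0.5:0.95" in the column headers follows the COCO convention: AP is computed at ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 and then averaged. Below is a minimal sketch of the IoU computation and the threshold list, for illustration only. The function names are hypothetical and not part of this repo; the actual evaluation is done by the scripts in 'evaluate.sh'.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds():
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by 'mAP 0.5:0.95'."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

A detection counts as a true positive at a given threshold only if its IoU with a matching ground-truth box reaches that threshold, so the averaged metric rewards tight localization.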

Dataset

Download COCO:
http://images.cocodataset.org/zips/train2017.zip
http://images.cocodataset.org/zips/val2017.zip
http://images.cocodataset.org/annotations/annotations_trainval2017.zip

Unzip the archives and arrange the COCO dataset as follows:
/path/to/dataset/annotations/instances_train2017.json
/path/to/dataset/annotations/instances_val2017.json
/path/to/dataset/images/train2017/*.jpg
/path/to/dataset/images/val2017/*.jpg

Set opt.dataset_path = "/path/to/dataset" in 'config.py'.
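Before training, it can help to confirm that the folder layout above is in place. The following is a small sketch, not part of this repo: the helper name missing_coco_paths is hypothetical, and it only checks that the two annotation files exist and that each image folder contains at least one .jpg file.

```python
from pathlib import Path


def missing_coco_paths(dataset_path):
    """Return the expected COCO paths that are missing under dataset_path."""
    root = Path(dataset_path)
    missing = []
    for split in ("train2017", "val2017"):
        ann = root / "annotations" / f"instances_{split}.json"
        if not ann.is_file():
            missing.append(str(ann))
        img_dir = root / "images" / split
        # glob() yields nothing if the folder is absent or has no .jpg files.
        if not any(img_dir.glob("*.jpg")):
            missing.append(str(img_dir / "*.jpg"))
    return missing


if __name__ == "__main__":
    for path in missing_coco_paths("/path/to/dataset"):
        print("missing:", path)
```

An empty result means the layout matches what is listed above; it says nothing about whether the annotation files themselves are valid.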

Train

See more examples in 'train.sh'.
a. Train from scratch (backbone="CSPDarknet-s" selects yolox-s; you can change it, e.g. CSPDarknet-nano, tiny, s, m, l, x):
python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48

b. Fine-tune: download the weights pre-trained on COCO and fine-tune on a custom dataset:
python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48 load_model="../weights/yolox-s.pth"

c. Resume: use 'resume=True' to continue a training run that was stopped unexpectedly:
python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48 load_model="exp/coco_CSPDarknet-s_640x640/model_last.pth" resume=True
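The commands above pass options as key=value arguments that take the place of the matching opt.xxx defaults in 'config.py'. How the repo parses them is not shown here, so the following is only a sketch of one way such overrides could work; apply_overrides and the SimpleNamespace stand-in for opt are hypothetical.

```python
import ast
from types import SimpleNamespace


def apply_overrides(opt, args):
    """Set opt.<key> = <value> for every 'key=value' string in args."""
    for arg in args:
        key, sep, raw = arg.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {arg!r}")
        try:
            # Turns "16" into 16, "True" into True, "None" into None.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Not a Python literal (e.g. a backbone name or a path): keep the string.
            value = raw
        setattr(opt, key, value)
    return opt


# Hypothetical defaults standing in for the values in config.py.
opt = SimpleNamespace(backbone="CSPDarknet-s", batch_size=48, use_amp=False)
apply_overrides(opt, ["backbone=CSPDarknet-m", "batch_size=16", "use_amp=True"])
print(opt)
```

In this sketch a value such as gpus='0' would become the integer 0 after the shell strips the quotes; the real parser may well keep it as a string, so treat the type handling as an assumption.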

Some Tips:

a. You can also change the parameters in 'train.sh' (they override opt.xxx in 'config.py') and run 'nohup sh train.sh &' to train in the background.
b. Multi-GPU training: set opt.gpus = "3,5,6,7" in 'config.py' or gpus="3,5,6,7" in 'train.sh'.
c. To disable multi-scale training, set opt.random_size = None in 'config.py' or random_size=None in 'train.sh'.
d. random_size = (14, 26) means that an integer is randomly selected from the interval (14, 26) and multiplied by 32 to give the input size.
e. Visualize the training log with TensorBoard:
    tensorboard --logdir exp/your_exp_id/logs_2021-08-xx-xx-xx and visit http://localhost:6006
   You can also use the following shell commands:
    (1) grep 'train epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
    (2) grep 'val epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
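The multi-scale rule in tip d can be sketched as follows. This is an illustration, not the repo's code: pick_input_size is a hypothetical name, and treating both ends of the interval as inclusive is an assumption, since the tip does not say whether 14 and 26 themselves can be drawn.

```python
import random


def pick_input_size(random_size=(14, 26), stride=32, rng=random):
    """Pick a training input size as (random integer in the interval) * stride."""
    low, high = random_size
    # Assumption: both endpoints are inclusive.
    return rng.randint(low, high) * stride


size = pick_input_size()
print(f"input size: {size}x{size}")
```

With random_size = (14, 26) and these assumptions, the input size ranges from 448 to 832 and is always a multiple of 32.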

Evaluate

Model weights are saved to './exp/your_exp_id/model_xx.pth'.
Set load_model='weight/path/to/evaluate.pth' and backbone='backbone-type' in 'evaluate.sh', then run:
sh evaluate.sh

Predict/Inference/Demo

a. Predict images (change img_dir and load_model to your own paths):
python predict.py gpus='0' backbone="CSPDarknet-s" vis_thresh=0.3 load_model="exp/coco_CSPDarknet-s_640x640/model_best.pth" img_dir='/path/to/dataset/images/val2017'

b. Predict video
python predict.py gpus='0' backbone="CSPDarknet-s" vis_thresh=0.3 load_model="exp/coco_CSPDarknet-s_640x640/model_best.pth" video_dir='/path/to/your/video.mp4'

You can also change the parameters in 'predict.sh' and run 'sh predict.sh'.
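The vis_thresh=0.3 argument in the commands above presumably controls which detections are drawn, by confidence score. A minimal sketch under that assumption is shown below; the detection format (a dict with "score", "label" and "bbox" keys) and the function name are hypothetical, and the real predict.py may filter differently.

```python
def filter_by_score(detections, vis_thresh=0.3):
    """Keep only detections whose confidence score reaches vis_thresh."""
    return [det for det in detections if det["score"] >= vis_thresh]


# Made-up detections, only to show the filtering step.
detections = [
    {"label": "person", "score": 0.92, "bbox": [10, 20, 110, 220]},
    {"label": "dog", "score": 0.12, "bbox": [50, 60, 90, 120]},
]
for det in filter_by_score(detections, vis_thresh=0.3):
    print(det["label"], det["score"])
```

Raising the threshold hides low-confidence boxes at the cost of missing some true objects; lowering it does the opposite.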

Train Custom Dataset (VOC format)

1. Put your annotations (.xml) and images (.jpg) into:
    /path/to/voc_data/images/train2017/*.jpg  # train images
    /path/to/voc_data/images/train2017/*.xml  # train xml annotations
    /path/to/voc_data/images/val2017/*.jpg  # val images
    /path/to/voc_data/images/val2017/*.xml  # val xml annotations

2. Set opt.label_name = ['your', 'dataset', 'label'] in 'config.py'
   Set opt.dataset_path = '/path/to/voc_data' in 'config.py'

3. Run: python tools/voc_to_coco.py
   The converted COCO-format annotations will be saved to:
    /path/to/voc_data/annotations/instances_train2017.json
    /path/to/voc_data/annotations/instances_val2017.json

4. (Optional) Visualize the converted annotations with:
    python tools/show_coco_anns.py
    An analysis of the COCO annotation format is available at https://blog.csdn.net/u010397980/article/details/90341223?spm=1001.2014.3001.5501

5. Run train.sh, evaluate.sh, and predict.sh (same as for COCO)
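The core of the conversion in step 3 is a change of box format: VOC stores corners (xmin, ymin, xmax, ymax), while COCO stores [x, y, width, height]. The sketch below shows that step for a single XML file. It is not the code in tools/voc_to_coco.py; the function name is hypothetical, and using the position in label_name as category_id is an assumption (the repo may number categories differently).

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text, label_name):
    """Convert the <object> entries of one VOC XML string to COCO-style dicts."""
    root = ET.fromstring(xml_text)
    annotations = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            # Assumption: category id is the index of the class in label_name.
            "category_id": label_name.index(name),
            "bbox": [xmin, ymin, width, height],  # COCO: top-left corner + size
            "area": width * height,
            "iscrowd": 0,
        })
    return annotations


SAMPLE_XML = """
<annotation>
  <filename>0001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""
print(voc_objects_to_coco(SAMPLE_XML, ["cat", "dog"]))
```

A full converter would also assign image ids and annotation ids and write the "images", "annotations" and "categories" lists into one JSON file per split.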

4. Multi/One-class Multi-object Tracking (MOT)

one-class/single-class MOT Dataset

DOING

Multi-class MOT Dataset

DOING

Train

DOING

Evaluate

DOING

Predict/Inference/Demo

DOING

5. Acknowledgement

https://github.com/Megvii-BaseDetection/YOLOX
https://github.com/PaddlePaddle/PaddleDetection
https://github.com/open-mmlab/mmdetection
https://github.com/xingyizhou/CenterNet