Minimal PyTorch implementation of YOLOv3

Overview

PyTorch-YOLOv3

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.

Ubuntu CI PyPI pyversions PyPI license

Installation

Installing from source

For normal training and evaluation we recommend installing the package from source using a poetry virtual enviroment.

git clone https://github.com/eriklindernoren/PyTorch-YOLOv3
cd PyTorch-YOLOv3/
pip3 install poetry --user
poetry install

You need to join the virtual enviroment by runing poetry shell in this directory before running any of the following commands without the poetry run prefix. Also have a look at the other installing method, if you want to use the commands everywhere without opening a poetry-shell.

Download pretrained weights

./weights/download_weights.sh

Download COCO

./data/get_coco_dataset.sh

Install via pip

This installation method is recommended, if you want to use this package as a dependency in another python project. This method only includes the code, is less isolated and may conflict with other packages. Weights and the COCO dataset need to be downloaded as stated above. See API for further information regarding the packages API. It also enables the CLI tools yolo-detect, yolo-train, and yolo-test everywhere without any additional commands.

pip3 install pytorchyolo --user

Test

Evaluates the model on COCO test dataset. To download this dataset as well as weights, see above.

poetry run yolo-test --weights weights/yolov3.weights
Model mAP (min. 50 IoU)
YOLOv3 608 (paper) 57.9
YOLOv3 608 (this impl.) 57.3
YOLOv3 416 (paper) 55.3
YOLOv3 416 (this impl.) 55.5

Inference

Uses pretrained weights to make predictions on images. Below table displays the inference times when using as inputs images scaled to 256x256. The ResNet backbone measurements are taken from the YOLOv3 paper. The Darknet-53 measurement marked shows the inference time of this implementation on my 1080ti card.

Backbone GPU FPS
ResNet-101 Titan X 53
ResNet-152 Titan X 37
Darknet-53 (paper) Titan X 76
Darknet-53 (this impl.) 1080ti 74
poetry run yolo-detect --images data/samples/

Train

For argument descriptions have a lock at poetry run yolo-train --help

Example (COCO)

To train on COCO using a Darknet-53 backend pretrained on ImageNet run:

poetry run yolo-train --data config/coco.data  --pretrained_weights weights/darknet53.conv.74

Tensorboard

Track training progress in Tensorboard:

poetry run tensorboard --logdir='logs' --port=6006

Storing the logs on a slow drive possibly leads to a significant training speed decrease.

You can adjust the log directory using --logdir when running tensorboard and yolo-train.

Train on Custom Dataset

Custom model

Run the commands below to create a custom model definition, replacing with the number of classes in your dataset.

./config/create_custom_model.sh <num-classes>  # Will create custom model 'yolov3-custom.cfg'

Classes

Add class names to data/custom/classes.names. This file should have one row per class name.

Image Folder

Move the images of your dataset to data/custom/images/.

Annotation Folder

Move your annotations to data/custom/labels/. The dataloader expects that the annotation file corresponding to the image data/custom/images/train.jpg has the path data/custom/labels/train.txt. Each row in the annotation file should define one bounding box, using the syntax label_idx x_center y_center width height. The coordinates should be scaled [0, 1], and the label_idx should be zero-indexed and correspond to the row number of the class name in data/custom/classes.names.

Define Train and Validation Sets

In data/custom/train.txt and data/custom/valid.txt, add paths to images that will be used as train and validation data respectively.

Train

To train on the custom dataset run:

poetry run yolo-train --model config/yolov3-custom.cfg --data config/custom.data

Add --pretrained_weights weights/darknet53.conv.74 to train using a backend pretrained on ImageNet.

API

You are able to import the modules of this repo in your own project if you install the pip package pytorchyolo.

An example prediction call from a simple OpenCV python script would look like this:

import cv2
from pytorchyolo import detect, models

# Load the YOLO model
model = models.load_model(
  "/yolov3.cfg", 
  "/yolov3.weights")

# Load the image as an numpy array
img = cv2.imread("")

# Runs the YOLO model on the image 
boxes = detect.detect_image(model, img)

print(boxes)

For more advanced usage look at the method's doc strings.

Credit

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

Abstract
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 AP50 in 51 ms on a Titan X, compared to 57.5 AP50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/.

[Paper] [Project Webpage] [Authors' Implementation]

@article{yolov3,
  title={YOLOv3: An Incremental Improvement},
  author={Redmon, Joseph and Farhadi, Ali},
  journal = {arXiv},
  year={2018}
}
Owner
Erik Linder-Norén
ML engineer at Apple. Excited about machine learning, basketball and building things.
Erik Linder-Norén
Image data augmentation scheduler for albumentations transforms

albu_scheduler Scheduler for albumentations transforms based on PyTorch schedulers interface Usage TransformMultiStepScheduler import albumentations a

19 Aug 04, 2021
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.

This repo contains some of the codes for the following paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code

Xuewen Yang 56 Dec 08, 2022
DIR-GNN - Discovering Invariant Rationales for Graph Neural Networks

DIR-GNN "Discovering Invariant Rationales for Graph Neural Networks" (ICLR 2022)

Ying-Xin (Shirley) Wu 70 Nov 13, 2022
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information This repository contains code, model, dataset for ChineseBERT at ACL2021. Ch

413 Dec 01, 2022
Improving Calibration for Long-Tailed Recognition (CVPR2021)

MiSLAS Improving Calibration for Long-Tailed Recognition Authors: Zhisheng Zhong, Jiequan Cui, Shu Liu, Jiaya Jia [arXiv] [slide] [BibTeX] Introductio

DV Lab 116 Dec 20, 2022
A minimal implementation of Gaussian process regression in PyTorch

pytorch-minimal-gaussian-process In search of truth, simplicity is needed. There exist heavy-weighted libraries, but as you know, we need to go bare b

Sangwoong Yoon 38 Nov 25, 2022
Multivariate Boosted TRee

Multivariate Boosted TRee What is MBTR MBTR is a python package for multivariate boosted tree regressors trained in parameter space. The package can h

SUPSI-DACD-ISAAC 61 Dec 19, 2022
Loopy belief propagation for factor graphs on discrete variables, in JAX!

PGMax implements general factor graphs for discrete probabilistic graphical models (PGMs), and hardware-accelerated differentiable loopy belief propagation (LBP) in JAX.

Vicarious 62 Dec 23, 2022
Code and Datasets from the paper "Self-supervised contrastive learning for volcanic unrest detection from InSAR data"

Code and Datasets from the paper "Self-supervised contrastive learning for volcanic unrest detection from InSAR data" You can download the pretrained

Bountos Nikos 3 May 07, 2022
Analyzing basic network responses to novel classes

novelty-detection Analyzing how AlexNet responds to novel classes with varying degrees of similarity to pretrained classes from ImageNet. If you find

Noam Eshed 34 Oct 02, 2022
A simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)

this is a simple rest api serving a deep learning model that classifies human gender based on their faces. (vgg16 transfare learning)

crispengari 5 Dec 09, 2021
This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

Introduction This is an official implementation of CvT: Introducing Convolutions to Vision Transformers. We present a new architecture, named Convolut

Bin Xiao 175 Jan 08, 2023
FishNet: One Stage to Detect, Segmentation and Pose Estimation

FishNet FishNet: One Stage to Detect, Segmentation and Pose Estimation Introduction In this project, we combine target detection, instance segmentatio

1 Oct 05, 2022
Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

V-MPO Simple code to demonstrate Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) in Pyt

Nugroho Dewantoro 9 Jun 06, 2022
the official implementation of the paper "Isometric Multi-Shape Matching" (CVPR 2021)

Isometric Multi-Shape Matching (IsoMuSh) Paper-CVF | Paper-arXiv | Video | Code Citation If you find our work useful in your research, please consider

Maolin Gao 9 Jul 17, 2022
Using pretrained GROVER to extract the atomic fingerprints from molecule

Extracting atomic fingerprints from molecules using pretrained Graph Neural Network models (GROVER).

Xuan Vu Nguyen 1 Jan 28, 2022
A PaddlePaddle version image model zoo.

Paddle-Image-Models English | 简体中文 A PaddlePaddle version image model zoo. Install Package Install by pip: $ pip install ppim Install by wheel package

AgentMaker 131 Dec 07, 2022
Official public repository of paper "Intention Adaptive Graph Neural Network for Category-Aware Session-Based Recommendation"

Intention Adaptive Graph Neural Network (IAGNN) This is the official repository of paper Intention Adaptive Graph Neural Network for Category-Aware Se

9 Nov 22, 2022
Datasets for new state-of-the-art challenge in disentanglement learning

High resolution disentanglement datasets This repository contains the Falcor3D and Isaac3D datasets, which present a state-of-the-art challenge for co

NVIDIA Research Projects 37 May 26, 2022