Train a state-of-the-art yolov3 object detector from scratch!

Overview

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch License: CC BY 4.0 DOI

This repo let's you train a custom image detector using the state-of-the-art YOLOv3 computer vision algorithm. For a short write up check out this medium post. This repo works with TensorFlow 2.3 and Keras 2.4.

Before getting started:

  • 🍴 fork this repo so that you can use it as part of your own project.
  • ⭐ star this repo to get notifications on future improvements.

Pipeline Overview

To build and test your YOLO object detection algorithm follow the below steps:

  1. Image Annotation
    • Install Microsoft's Visual Object Tagging Tool (VoTT)
    • Annotate images
  2. Training
    • Download pre-trained weights
    • Train your custom YOLO model on annotated images
  3. Inference
    • Detect objects in new images and videos

Repo structure

  • 1_Image_Annotation: Scripts and instructions on annotating images
  • 2_Training: Scripts and instructions on training your YOLOv3 model
  • 3_Inference: Scripts and instructions on testing your trained YOLO model on new images and videos
  • Data: Input Data, Output Data, Model Weights and Results
  • Utils: Utility scripts used by main scripts

Getting Started

Google Colab Tutorial Open In Colab

With Google Colab you can skip most of the set up steps and start training your own model right away.

Requisites

The only hard requirement is a running version of python 3.6 or 3.7. To install python 3.7 go to

and follow the installation instructions. Note that this repo has only been tested with python 3.6 and python 3.7 thus it is recommened to use either python3.6 or python3.7.

To speed up training, it is recommended to use a GPU with CUDA support. For example on AWS you can use a p2.xlarge instance (Tesla K80 GPU with 12GB memory). Inference speed on a typical CPU is approximately ~2 images per second. If you want to use your own machine, follow the instructions at tensorflow.org/install/gpu to install CUDA drivers. Make sure to install the correct version of CUDA and cuDNN.

Installation

Setting up Virtual Environment [Linux or Mac]

Clone this repo with:

git clone https://github.com/AntonMu/TrainYourOwnYOLO
cd TrainYourOwnYOLO/

Create Virtual (Linux/Mac) Environment:

python3 -m venv env
source env/bin/activate

Make sure that, from now on, you run all commands from within your virtual environment.

Setting up Virtual Environment [Windows]

Use the Github Desktop GUI to clone this repo to your local machine. Navigate to the TrainYourOwnYOLO project folder and open a power shell window by pressing Shift + Right Click and selecting Open PowerShell window here in the drop-down menu.

Create Virtual (Windows) Environment:

py -m venv env
.\env\Scripts\activate

PowerShell Make sure that, from now on, you run all commands from within your virtual environment.

Install Required Packages [Windows, Mac or Linux]

Install required packages (from within your virtual environment) via:

pip install -r requirements.txt

If this fails, you may have to upgrade your pip version first with pip install pip --upgrade.

Quick Start (Inference only)

To test the cat face detector on test images located in TrainYourOwnYOLO/Data/Source_Images/Test_Images run the Minimal_Example.py script in the root folder with:

python Minimal_Example.py

The outputs are saved in TrainYourOwnYOLO/Data/Source_Images/Test_Image_Detection_Results. This includes:

  • Cat pictures with bounding boxes around faces with confidence scores and
  • Detection_Results.csv file with file names and locations of bounding boxes.

If you want to detect cat faces in your own pictures, replace the cat images in Data/Source_Images/Test_Images with your own images.

Full Start (Training and Inference)

To train your own custom YOLO object detector please follow the instructions detailed in the three numbered subfolders of this repo:

To make everything run smoothly it is highly recommended to keep the original folder structure of this repo!

Each *.py script has various command line options that help tweak performance and change things such as input and output directories. All scripts are initialized with good default values that help accomplish all tasks as long as the original folder structure is preserved. To learn more about available command line options of a python script <script_name.py> run:

python <script_name.py> -h

NEW: Weights and Biases

TrainYourOwnYOLO supports Weights & Biases to track your experiments online. Sign up at wandb.ai to get an API key and run:

wandb -login <API_KEY>

where <API_KEY> is your Weights & Biases API key.

Multi-Stream-Multi-Model-Multi-GPU

If you want to run multiple streams in parallel, head over to github.com/bertelschmitt/multistreamYOLO. Thanks to @bertelschmitt for putting the work into this.

License

Unless explicitly stated otherwise at the top of a file, all code is licensed under CC BY 4.0. This repo makes use of ilmonteux/logohunter which itself is inspired by qqwweee/keras-yolo3.

Troubleshooting

  1. If you encounter any error, please make sure you follow the instructions exactly (word by word). Once you are familiar with the code, you're welcome to modify it as needed but in order to minimize error, I encourage you to not deviate from the instructions above. If you would like to file an issue, please use the provided template and make sure to fill out all fields.

  2. If you encounter a FileNotFoundError, Module not found or similar error, make sure that you did not change the folder structure. Your directory structure must look exactly like this:

    TrainYourOwnYOLO
    └─── 1_Image_Annotation
    └─── 2_Training
    └─── 3_Inference
    └─── Data
    └─── Utils
    

    If you use a different name such as e.g. TrainYourOwnYOLO-master you will have to specify the correct paths as command line arguments in every function call.

    Don't use spaces in file or folder names, i.e. instead of my folder use my_folder.

  3. If you are a Linux user and having trouble installing *.snap package files try:

    snap installβ€Š--dangerous vott-2.1.0-linux.snap

    See Snap Tutorial for more information.

  4. If you have a newer version of python on your system, make sure that you create your virtual environment with version 3.7. You can use virtualenv for this:

    pip install virtualenv
    virtualenv env --python=python3.7
    

    Then follow the same steps as above.

Need more help? File an Issue!

If you would like to file an issue, please use the provided issue template and make sure to complete all fields. This makes it easier to reproduce the issue for someone trying to help you.

Issue

Issues without a completed issue template will be closed and marked with the label "issue template not completed".

Stay Up-to-Date

  • ⭐ star this repo to get notifications on future improvements and
  • 🍴 fork this repo if you like to use it as part of your own project.

CatVideo

Licensing

This work is licensed under a Creative Commons Attribution 4.0 International License. This means that you are free to:

  • Share β€” copy and redistribute the material in any medium or format
  • Adapt β€” remix, transform, and build upon the material for any purpose, even commercially.

Under the following terms:

  • Attribution

Cite as:

@misc{TrainYourOwnYOLO,
  title = {TrainYourOwnYOLO: Building a Custom Object Detector from Scratch},
  author = {Anton Muehlemann},
  year = {2019},
  url = {https://github.com/AntonMu/TrainYourOwnYOLO},
  doi = {10.5281/zenodo.5112375}
}

If your work doesn't include a citation list, simply link this github repo!

CC BY 4.0

Comments
  • Validation loss is nan

    Validation loss is nan

    @AntonMu I am getting nan values for val_loss I have changed anchors according to my data set using kmeans.py in cfg file classes=4, filters=27, width=height=608 in yolo_train.py, i have changed batchsize=1, earlystopping(patience=20), input_shape=608,608, model.compile(optimizer=Adam(lr=1e-10)

    in cfg file i have changed last yolo layers mask values from 0,1,2 to 1,2,3.

    before changing those parameters also i got val_loss as nan and even after changing alsi i am getting val_loss as nan pls help me out with this issue 80602383-ae060880-8a4c-11ea-92db-f748d49c8285

    opened by allipilli-harshitha 26
  • Trying to run the Train_Yolo throws error

    Trying to run the Train_Yolo throws error

    np_resource = np.dtype([("resource", np.ubyte, 1)]) Using TensorFlow backend. Traceback (most recent call last): File "C:/Users/Shreeni/Downloads/ripo/TrainYourOwnYOLO-master/2_Training/Train_YOLO.py", line 32, in from keras_yolo3.yolo3.model import preprocess_true_boxes, yolo_body, tiny_yolo_body, yolo_loss ModuleNotFoundError: No module named 'keras_yolo3'

    help wanted 
    opened by ShashiAdhikari 21
  • Error occurred when finalizing GeneratorDataset iterator

    Error occurred when finalizing GeneratorDataset iterator

    While putting the last touches on the multi-stream-multi-model-multi-GPU YOLO (out any minute) I notice that training aborts with a "W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated." This happens with the factory cat training set on a freshly cloned and untouched TrainYourOwnYOLO.

    The model is usable, however it stops at a loss of 15, which is a bit high:

    Epoch 81/102
    22/22 [==============================] - 5s 206ms/step - loss: 15.3284 - val_loss: 13.8459
    Epoch 82/102
    22/22 [==============================] - 5s 208ms/step - loss: 14.5368 - val_loss: 14.2095
    Epoch 83/102
    22/22 [==============================] - ETA: 0s - loss: 14.7401
    Epoch 00083: ReduceLROnPlateau reducing learning rate to 9.999999939225292e-10.
    22/22 [==============================] - 5s 207ms/step - loss: 14.7401 - val_loss: 13.9430
    Epoch 84/102
    22/22 [==============================] - 5s 208ms/step - loss: 14.1699 - val_loss: 15.3087
    Epoch 00084: early stopping
    2020-10-31 05:50:12.560589: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated.
    
    

    I can reproduce this consistently on two separate Ubuntu machines, both with large amounts of memory (64 Gbyte and 256 Gbytes. All libraries stock as per prerequisites. CUDA 10.1, Python3.7, Ubuntu 10. The problem has received mentions elsewhere

    opened by bertelschmitt 15
  • Training Fails

    Training Fails

    I tried to train with the images and annotations provided without changing anything.

    Python: v3.7 Tensorflow: v2 System: PI4 2GB RAM (Linux raspberrypi 4.19.75-v7l+ [Buster])

    And it fails when running Train_TOLO.py It fails after starting Epoch 1.

    File "/home/pi/.virtualenvs/cv/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute six.raise_from(core._status_to_exception(e.code, message), None) File "", line 3, in raise_from tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,416,416,32] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu [[node conv2d_1/convolution (defined at /home/pi/.virtualenvs/cv/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1751) ]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. [Op:__inference_keras_scratch_graph_20380] Function call stack: keras_scratch_graph

    I can provide more info if needed. Thanks in advance for any insight. I saw else where that it might be due to memory. I will try to figure that out and update if that does happen to be the case.

    help wanted 
    opened by NiklasWilson 15
  • Keras Error

    Keras Error

    @AntonMu

    _, ignore_mask = K.control_flow_ops.while_loop(lambda b,*args: b<m, loop_body, [0, ignore_mask]) AttributeError: module 'keras.backend' has no attribute 'control_flow_ops'

    How do I fix the error?

    opened by forceassaulter 11
  • Use two different weight files for Detection

    Use two different weight files for Detection

    ** System information** What is the top-level directory of the model you are using: TrainYourOwnYOLO/ Have I written custom code (as opposed to using a stock example script provided in the repo): No OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows Ananconda Prompt Python TensorFlow version (use command below): 1.13 CUDA/cuDNN version: 10.0 GPU model and memory: NVIDIA GeForce GTX 1050 Ti and Exact command to reproduce: NA

    Describe the problem/Help

    Hi,

    I have querry regarding weight files. I have multiple weight files which are trained on different classes and object. I want to use these multiple weight files simultaneously to detect the objects in the image. Has anyone tried this? Does it work?

    opened by shahzaibraza37 10
  • Train on Colab GPU

    Train on Colab GPU

    Hello, I'm interested in training on a Google Colab GPU. Getting the code running on Colab is pretty straightforward, but it doesn't actually run on the GPU and is therefore quite slow. I'm not sure how to change this; could you point me in the right direction? Many thanks.

    enhancement 
    opened by spectorp 9
  • VOTT and VOTT-not

    VOTT and VOTT-not

    Hi, @AntonMu. To help people get into the rather steep learning curve at higher speed, I wrote a little document covering tips&tricks of VOTT and VOTT-not. VOTT is at the start of each project, a lot of mistakes can be made, and avoided. How to you want me to go about adding the document? It could be the start of a series of YOLO-howtos, and having successfully fed myself and my family with writing, I am glad to give back to this great project.

    opened by bertelschmitt 9
  • Add to trained custom model without retraining whole set?

    Add to trained custom model without retraining whole set?

    Projects runs well and fast (on GPU). Thank you! How would I ADD images and tags to my trained custom model? I have thousands of images, and I would like to add a few hundred every so often without retraining the whole model (takes hours, even on GPU.) Updates would contain new images with existing tags, and new images with new tags.

    opened by bertelschmitt 9
  • EC2 AWS: Keras: ValueError: Invalid backend.

    EC2 AWS: Keras: ValueError: Invalid backend.

    Before filing a report consider this question:

    Have you followed the instructions exactly (word by word)?

    Once you are familiar with the code, you're welcome to modify it. Please only continue to file a bug report if you encounter an issue with the provided code and after having followed the instructions.

    If you have followed the instructions exactly and would still like to file a bug or make a feature requests please follow the steps below.

    1. It must be a bug, a feature request, or a significant problem with the documentation (for small docs fixes please send a PR instead).
    2. The form below must be filled out.

    System information

    • What is the top-level directory of the model you are using: /home/
    • Have I written custom code (as opposed to using a stock example script provided in the repo):Yes, 2 lines were modify, but those are in the train_YOLO.py, and the error is happening during the weights download process, "Code modified (from PIL import Image, ImageFile, ImageFile.LOAD_TRUNCATED_IMAGES = True)"
    • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Deep Learning AMI (Ubuntu 18.04) Version 26.0
    • TensorFlow version (use command below): tensorflow==1.15.0
    • CUDA/cuDNN version: Built on Sat_Aug_25_21:08:01_CDT_2018 Cuda compilation tools, release 10.0, V10.0.130
    • GPU model and memory: High-performance NVIDIA K80 GPUs, each with 2,496 parallel processing cores and 12GiB of GPU memory
    • Exact command to reproduce: TrainYourOwnYOLO/2_Training$ python Download_and_Convert_YOLO_weights.py You can obtain the TensorFlow version with

    python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)" v1.15.0-rc3-22-g590d6ee 1.15.0

    Describe the problem

    Describe the problem clearly here. Be sure to convey here why it's a bug or a feature request.

    I first tried to run the pre-trained model, and the training locally in Windows with Linux subsystem, and both worked fine! Awesome job, thank you so much for sharing! The problem happened when I tried to implement the YOLO in AWS inside of an EC2 instance. I followed the instructions step by step, but when I got to the point when I have to download the pre-trained model, Keras failed to load the backend.

    Source code / logs

    user:~/YOLOV3/TrainYourOwnYOLO/2_Training$ python Download_and_Convert_YOLO_weights.py

    99% (2477235 of 2480070) |################################ | Elapsed Time: 0:00:30 ETA: 0:00:00Traceback (most recent call last): File "convert.py", line 14, in from keras import backend as K File "/home/ubuntu/YOLOV3/TrainYourOwnYOLO/env/lib/python3.6/site-packages/keras/init.py", line 3, in from . import utils File "/home/ubuntu/YOLOV3/TrainYourOwnYOLO/env/lib/python3.6/site-packages/keras/utils/init.py", line 6, in from . import conv_utils File "/home/ubuntu/YOLOV3/TrainYourOwnYOLO/env/lib/python3.6/site-packages/keras/utils/conv_utils.py", line 9, in from .. import backend as K File "/home/ubuntu/YOLOV3/TrainYourOwnYOLO/env/lib/python3.6/site-packages/keras/backend/init.py", line 1, in from .load_backend import epsilon File "/home/ubuntu/YOLOV3/TrainYourOwnYOLO/env/lib/python3.6/site-packages/keras/backend/load_backend.py", line 101, in raise ValueError('Invalid backend. Missing required entry : ' + e) ValueError: Invalid backend. Missing required entry : placeholder

    opened by silvestre139 9
  • Error while running Train_YOLO.py file

    Error while running Train_YOLO.py file

    Before filing a report consider the following two questions:

    Have you followed all Readme instructions exactly?

    Yes

    Have you checked the troubleshooting section?

    Yes

    Have you looked for similar issues?

    Yes

    System information

    • What is the top-level directory of the model you are using:
    • Have I written custom code (as opposed to using a stock example script provided in the repo):
    • **System: Windows 10
    • TensorFlow version (use command below): 2.3.1
    • CUDA/cuDNN version: 10.1/7.6.3
    • GPU model and memory: MX150
    • Exact command to reproduce:

    You can obtain the TensorFlow version with

    python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

    Describe the problem

    Sir I am facing the problems while running the Train_YOLO.py file. After installing all the requires models, libraries and setting all things correctly according to provided instructions then also it gives some error after running Trin_Yolo.py file so I can train the model so please give me the solution of this as soon as posible.

    Problem

    2021-01-19 19:25:07.843470: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Stats: Limit: 1408043828 InUse: 1147219200 MaxInUse: 1147219456 NumAllocs: 2137 MaxAllocSize: 708837376 Reserved: 0 PeakReserved: 0 LargestFreeBlock: 0

    2021-01-19 19:25:07.845884: W tensorflow/core/common_runtime/bfc_allocator.cc:439] ****************************_____***********************************************____________ 2021-01-19 19:25:08.113307: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at conv_ops.cc:947 : Resource exhausted: OOM when allocating tensor with shape[32,32,416,416] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc Traceback (most recent call last): File "Train_YOLO.py", line 267, in callbacks=frozen_callbacks, File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func return func(*args, **kwargs) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1829, in fit_generator initial_epoch=initial_epoch) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\wandb\integration\keras\keras.py", line 120, in new_v2 return old_v2(*args, **kwargs) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper return method(self, *args, **kwargs) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1098, in fit tmp_logs = train_function(iterator) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in call result = self._call(*args, **kwds) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\def_function.py", line 840, in _call return self._stateless_fn(*args, **kwds) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\function.py", line 2829, in call return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\function.py", line 1848, in _filtered_call cancellation_manager=cancellation_manager) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\function.py", line 1924, in _call_flat ctx, args, cancellation_manager=cancellation_manager)) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\function.py", line 550, in call ctx=ctx) File "C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\eager\execute.py", line 60, in quick_execute inputs, attrs, num_outputs) tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,32,416,416] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [[node functional_5/conv2d/Conv2D (defined at C:\Users\Akshat\AppData\Local\Programs\Python\Python36\lib\site-packages\wandb\integration\keras\keras.py:120) ]] Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info. [Op:__inference_train_function_22034]

    Function call stack: train_function

    2021-01-19 19:25:16.373648: W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated. [[{{node PyFunc}}]]

    opened by ghost 8
  • Multiple Labels in an image

    Multiple Labels in an image

    We are using this repo for a big project in university. We have succesfully implemented the model however it is handicapped by the fact that it can only have one label per image. In our setting we have multiple objects. We recon that making it able train on multiple object would greatly increase its performance. So the questions is: If you have multiple objects in an image, how can this be implemented in the traning using our current setup?

    Best regards, Christain

    opened by ChrisRawstone 1
  • Models,anchors,classes loading time - for optimization

    Models,anchors,classes loading time - for optimization

    Hi @AntonMu ,

    Hope you are doing good.

    First of all thank you for this. I have a question, every time when i predict on an image, I get this "models, anchors and classes loaded in 7.95 seconds"

    So for optimization purposes, would like to know if there is any way this can be loaded initially (say at first or when i start for the first time) and for the subsequent calls, this 7 - 8 seconds can be saved.

    Thanks in advance!

    opened by vivektop 1
  • validation set and loss in tensorboard graph

    validation set and loss in tensorboard graph

    Hi @AntonMu i would like to ask you where i can find the graph related to the validation set? When i use the tensorboard only appears the loss of the trainning.

    Another question is related to the loss graph from tensorboard. Can you explain to my why the loss has a such a steep decline between the first trainning epochs and the second ones?

    I have another question, why do you freeze some layers in the first set of trainninf and then unfreeze them all in the last part? And how many layers do you freeze initially?

    image

    opened by joaoalves10 4
  • What's the purpose of pre-trained weights in YOLO?

    What's the purpose of pre-trained weights in YOLO?

    "Before getting started download the pre-trained YOLOv3 weights and convert them to the keras format", I want to understand why do we need to use pre-trained weights in yolo.

    question 
    opened by anjanaouseph 17
  • Train and run interference at the same time on the same machine

    Train and run interference at the same time on the same machine

    Do you want to keep your interference going while those long training jobs are running? The multi-stream-multi-model-multi-GPU version of TrainYourOwnYOLO (now available here) lets you do just that. If you only have one GPU, limit the memory used by your interference streams so that Train_YOLO.py has enough GPU RAM to work with (experiment!). Training will commence at reduced speed. If you have two GPUs in your machine, move the interference jobs to the 2nd GPU (run_on_gpu: 1 in MultiDetect.conf). Training will grab all memory on GPU #0 and run at full speed, while interference runs at full speed on GPU #1. Training doesn’t seem to be smart enough to grab GPU #1 when its available, and when GPU #0 is busy.

    enhancement wontfix 
    opened by bertelschmitt 2
Releases(v0.2.3)
Owner
AntonMu
University of Oxford, UC Berkeley
AntonMu
ICON: Implicit Clothed humans Obtained from Normals

ICON: Implicit Clothed humans Obtained from Normals arXiv, December 2021. Yuliang Xiu Β· Jinlong Yang Β· Dimitrios Tzionas Β· Michael J. Black Table of C

Yuliang Xiu 1.1k Dec 30, 2022
Collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

Jun Chen 139 Dec 21, 2022
This repository includes code of my study about Asynchronous in Frequency domain of GAN images.

Exploring the Asynchronous of the Frequency Spectra of GAN-generated Facial Images Binh M. Le & Simon S. Woo, "Exploring the Asynchronous of the Frequ

4 Aug 06, 2022
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
AI virtual gym is an AI program which can be used to exercise and can be used to see if we are doing the exercises

AI virtual gym is an AI program which can be used to exercise and can be used to see if we are doing the exercises

4 Feb 13, 2022
It's A ML based Web Site build with python and Django to find the breed of the dog

ML-Based-Dog-Breed-Identifier This is a Django Based Web Site To Identify the Breed of which your DOG belogs All You Need To Do is to Follow These Ste

Sanskar Dwivedi 2 Oct 12, 2022
A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks)

A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks) This repository contains a PyTorch implementation for the paper: Deep Pyra

Greg Dongyoon Han 262 Jan 03, 2023
Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

NeX: Real-time View Synthesis with Neural Basis Expansion Project Page | Video | Paper | COLAB | Shiny Dataset We present NeX, a new approach to novel

536 Dec 20, 2022
HiddenMarkovModel implements hidden Markov models with Gaussian mixtures as distributions on top of TensorFlow

Class HiddenMarkovModel HiddenMarkovModel implements hidden Markov models with Gaussian mixtures as distributions on top of TensorFlow 2.0 Installatio

Susara Thenuwara 2 Nov 03, 2021
Driller: augmenting AFL with symbolic execution!

Driller Driller is an implementation of the driller paper. This implementation was built on top of AFL with angr being used as a symbolic tracer. Dril

Shellphish 791 Jan 06, 2023
Code for ICDM2020 full paper: "Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning"

Subg-Con Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning (Jiao et al., ICDM 2020): https://arxiv.org/abs/2009.10273 Over

34 Jul 06, 2022
Interacting Two-Hand 3D Pose and Shape Reconstruction from Single Color Image (ICCV 2021)

Interacting Two-Hand 3D Pose and Shape Reconstruction from Single Color Image Interacting Two-Hand 3D Pose and Shape Reconstruction from Single Color

75 Dec 02, 2022
A robotic arm that mimics hand movement through MediaPipe tracking.

La-Z-Arm A robotic arm that mimics hand movement through MediaPipe tracking. Hardware NVidia Jetson Nano Sparkfun Pi Servo Shield Micro Servos Webcam

Alfred 1 Jun 05, 2022
DanceTrack: Multiple Object Tracking in Uniform Appearance and Diverse Motion

DanceTrack DanceTrack is a benchmark for tracking multiple objects in uniform appearance and diverse motion. DanceTrack provides box and identity anno

260 Dec 28, 2022
Official code for the paper "Self-Supervised Prototypical Transfer Learning for Few-Shot Classification"

Self-Supervised Prototypical Transfer Learning for Few-Shot Classification This repository contains the reference source code and pre-trained models (

EPFL INDY 44 Nov 04, 2022
Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

IterMVS official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo' Introduction IterMVS is a novel lear

Fangjinhua Wang 127 Jan 04, 2023
Sequence modeling benchmarks and temporal convolutional networks

Sequence Modeling Benchmarks and Temporal Convolutional Networks (TCN) This repository contains the experiments done in the work An Empirical Evaluati

CMU Locus Lab 3.5k Jan 01, 2023
Codes for "Template-free Prompt Tuning for Few-shot NER".

EntLM The source codes for EntLM. Dependencies: Cuda 10.1, python 3.6.5 To install the required packages by following commands: $ pip3 install -r requ

77 Dec 27, 2022
Video Matting Refinement For Python

Video-matting refinement Library (use pip to install) scikit-image numpy av matplotlib Run Static background python path_to_video.mp4 Moving backgroun

3 Jan 11, 2022
An easier way to build neural search on the cloud

An easier way to build neural search on the cloud Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g

Jina AI 17k Jan 02, 2023