Contextual speed detection for python

Last update: Dec 16, 2021

Overview

Speed Prediction using Optical Flow and 2D CNN

About the challenge:

This project addresses the Comma.AI Speed Challenge, which was created by Comma.AI and asks participants to predict the speed of a car from a video.

Pipeline

TensorFlow version: 2.2.0

Steps for implementing speed estimation:

Save the frames of the train.mp4 and test.mp4 videos as images using DatasetConverter.py.
Using the images from the videos, compute dense optical flow on the image sequence and save the optical flow images using VideoToOpticalFlowImage.py.
Train the network below on the optical flow images and save the best-performing model using a custom callback.
Run the saved model on the test dataset using UseModel.py.

Optical Flow

Optical flow is computed on two adjacent frames of a video. Both frames are converted to grayscale and passed to cv2.calcOpticalFlowFarneback(), which returns a two-channel array with the same height and width as the input. For each pixel, the two channels hold the horizontal and vertical displacement relative to the previous frame; the magnitude of that displacement indicates how fast the pixel is moving, and its angle gives the direction. For visualization and training, the two channels are combined into a single HSV-based color image.

Data Augmentation

Every image is flipped horizontally, and the flipped copy keeps the same target value as the image from which it is derived. This data augmentation played a significant role in reducing the validation loss.
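A minimal sketch of this augmentation, assuming the images are held in a NumPy array of shape (N, H, W, C) and the speeds in an array of shape (N,). The function name augment_with_flip is hypothetical, and the project may instead flip images on the fly inside its data generator.

```python
import numpy as np


def augment_with_flip(images, speeds):
    """Return the dataset plus a horizontally flipped copy of every image.

    images: array of shape (N, H, W, C)
    speeds: array of shape (N,), one target speed per image
    """
    images = np.asarray(images)
    speeds = np.asarray(speeds)
    flipped = images[:, :, ::-1, :]  # reverse the width axis
    # The flipped images reuse the original targets unchanged.
    return np.concatenate([images, flipped]), np.concatenate([speeds, speeds])
```

The result has twice as many samples as the input, with the flipped copies stored after the originals.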

Model

The following model is a 2D CNN designed to be used on optical flow images. Compared with a 3D CNN trained on images from the video, the 2D CNN with optical flow is faster to train and reaches a lower MSE loss.

Training:

The 2D CNN was trained for 150 epochs, reaching a validation MSE loss of 0.18 and a training MSE loss of 0.05.

Output:

The GIF below shows the prediction versus the ground truth for the images on which the model was trained:

The GIF below shows the prediction on the test images:

Learning:

Image augmentation significantly improves the model's speed estimates.
Writing custom data generators for reading batches of images and ground truth.
A 2D CNN with optical flow performs better than a 3D CNN in terms of both training time and accuracy.
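As an illustration of the custom data generator mentioned above, here is a minimal sketch that yields batches of images and ground-truth speeds. It is not the project's generator: the name batch_generator, the load_image callable, and the scaling of pixel values to [0, 1] are all assumptions made for the example.

```python
import numpy as np


def batch_generator(image_paths, speeds, batch_size, load_image,
                    shuffle=True, seed=0):
    """Yield (images, speeds) batches indefinitely.

    load_image is a caller-supplied function (hypothetical here) that maps
    a path to an array of shape (H, W, C).
    """
    speeds = np.asarray(speeds, dtype=np.float32)
    indices = np.arange(len(image_paths))
    rng = np.random.default_rng(seed)
    while True:  # loop forever; the caller decides how many batches form an epoch
        if shuffle:
            rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch_idx = indices[start:start + batch_size]
            images = np.stack([load_image(image_paths[i]) for i in batch_idx])
            # Assumed preprocessing: scale 8-bit pixel values to [0, 1].
            yield images.astype(np.float32) / 255.0, speeds[batch_idx]
```

Reading images lazily, one batch at a time, keeps memory use bounded when the saved optical flow images do not fit in RAM. The last batch of each pass may be smaller than batch_size.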

Owner

Mahimana Bhatt
