How to detect objects in real time by using Jupyter Notebook and Neural Networks , by using Yolo3

Overview

Real Time Object Recognition From your Screen Desktop .

In this post, I will explain how to build a simply program to detect objects from you desktop computer.

We will see how using OpenCV and Python, we can detect objects by applying the most popular YOLO(You Look Only Once) algorithm.

OpenCV is the computer vision library/ framework that we we will be using to support our YOLOv3 algorithm

Darknet Architecture is pre-trained model for classifying 80 different classes. Our goal now is that we will use Darknet(YOLOv3) in OpenCV to classify objects using Python language.

For this project we will consider an standard resolution 1920 x 1080 , in windows 10 in Display Setting , select the resolution 1920 x 1080

Then you need to install Anaconda at this link

img

After you install it , check that your terminal , recognize conda

C:\conda --version
conda 4.10.3

The environments supported that I will consider is Python 3.7, Keras 2.4.3 and TensorFlow 2.4.0, let us create the environment, go to you command promt terminal and type the following:

conda create -n detector python==3.7.10
conda activate detector

then in your terminal type the following commands:

conda install ipykernel
Proceed ([y]/n)? y
python -m ipykernel install --user --name detector --display-name "Python (Object Detector)"

Then we install the correct versions of the the Tensorflow, and Numpy and Keras

we create a file called requirements.txt

if your are in Windows

notepad requirements.txt

or Linux

nano  requirements.txt

and you paste the following lines

Keras==2.4.3
keras-resnet==0.2.0
numpy==1.19.3
opencv-python==3.4.2.17
tensorflow==2.4.0
tensorflow-estimator==2.4.0
tensorflow-gpu==2.4.0
Pillow==9.0.0

and then we return back to the terminal and install them

pip install -r requirements.txt

then open the Jupyter notebook with the command

jupyter notebook&

then you click create new notebook Python (Object Detector) and then you can test if you can import the the following libraries

import numpy as np
from PIL import ImageGrab
import cv2
import time
import win32gui, win32ui, win32con, win32api

The next step is is define a function that enable record you screen

def grab_screen(region=None):
    hwin = win32gui.GetDesktopWindow()
    if region:
            left,top,x2,y2 = region
            width = x2 - left + 1
            height = y2 - top + 1
    else:
        width = win32api.GetSystemMetrics(win32con.SM_CXVIRTUALSCREEN)
        height = win32api.GetSystemMetrics(win32con.SM_CYVIRTUALSCREEN)
        left = win32api.GetSystemMetrics(win32con.SM_XVIRTUALSCREEN)
        top = win32api.GetSystemMetrics(win32con.SM_YVIRTUALSCREEN)
    hwindc = win32gui.GetWindowDC(hwin)
    srcdc = win32ui.CreateDCFromHandle(hwindc)
    memdc = srcdc.CreateCompatibleDC()
    bmp = win32ui.CreateBitmap()
    bmp.CreateCompatibleBitmap(srcdc, width, height)
    memdc.SelectObject(bmp)
    memdc.BitBlt((0, 0), (width, height), srcdc, (left, top), win32con.SRCCOPY)
    signedIntsArray = bmp.GetBitmapBits(True)
    img = np.fromstring(signedIntsArray, dtype='uint8')
    img.shape = (height,width,4)
    srcdc.DeleteDC()
    memdc.DeleteDC()
    win32gui.ReleaseDC(hwin, hwindc)
    win32gui.DeleteObject(bmp.GetHandle())
    return cv2.cvtColor(img, cv2.COLOR_BGRA2RGB)

then you define a new function called main() which will record your screen

def main():
    last_time = time.time()
    while True:
        # 1920 windowed mode
        screen = grab_screen(region=(0,40,1920,1120))
        img = cv2.resize(screen,None,fx=0.4,fy=0.3)
        height,width,channels = img.shape
        #detecting objects
        blob = cv2.dnn.blobFromImage(img,0.00392,(416,416),(0,0,0),True,crop=False)
        net.setInput(blob)
        outs = net.forward(outputlayers)
        #Showing info on screen/ get confidence score of algorithm in detecting an object in blob
        class_ids=[]
        confidences=[]
        boxes=[]
        for out in outs:
            for detection in out:
                scores = detection[5:]
                class_id = np.argmax(scores)
                confidence = scores[class_id]
                if confidence > 0.5:
                    #onject detected
                    center_x= int(detection[0]*width)
                    center_y= int(detection[1]*height)
                    w = int(detection[2]*width)
                    h = int(detection[3]*height)
                    #rectangle co-ordinaters
                    x=int(center_x - w/2)
                    y=int(center_y - h/2)
                    boxes.append([x,y,w,h]) #put all rectangle areas
                    confidences.append(float(confidence)) #how confidence was that object detected and show that percentage
                    class_ids.append(class_id) #name of the object tha was detected
        indexes = cv2.dnn.NMSBoxes(boxes,confidences,0.4,0.6)
        font = cv2.FONT_HERSHEY_PLAIN
        for i in range(len(boxes)):
            if i in indexes:
                x,y,w,h = boxes[i]
                label = str(classes[class_ids[i]])
                color = colors[i]
                cv2.rectangle(img,(x,y),(x+w,y+h),color,2)
                cv2.putText(img,label,(x,y+30),font,1,(255,255,255),2)
        #print('Frame took {} seconds'.format(time.time()-last_time))
        last_time = time.time()
        cv2.imshow('window', img)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            cv2.destroyAllWindows()
            break

and finally we download the following files

  1. yolo.cfg (Download from here) — Configuration file
  2. yolo.weights (Download from here) — pre-trained weights
  3. coco.names (Download from here)- 80 classes names

then you add the following code

net = cv2.dnn.readNetFromDarknet('yolov3.cfg', 'yolov3.weights')
classes = []
with open("coco.names","r") as f:
    classes = [line.strip() for line in f.readlines()]
    
layer_names = net.getLayerNames()
outputlayers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]
colors= np.random.uniform(0,255,size=(len(classes),3))

and finally you just run it with the simple code

main()

you can stop with simple press q

for example you want to identiy a Youtube video, of one beautiful girl

or this video https://youtu.be/QW-qWS3StZg?t=170

or the classic traffic recognition https://youtu.be/7HaJArMDKgI

Owner
Ruslan Magana Vsevolodovna
I am Data Scientist and Data Engineer. I have a Ph.D. in Physics and I am AWS certified in Machine Learning and Data Analytics
Ruslan Magana Vsevolodovna
A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

Hiromu OCHIAI 541 Dec 28, 2022
GDB python tool to pretty print and debug c++ xtensor containers

gdb_xt2np GDB python tool to pretty print, examine, and debug c++ Xtensor containers. Xtensor is a c++ library for scientific computing using multidim

Christopher Burke 4 Oct 29, 2021
A curated list of papers, code and resources pertaining to image composition

A curated list of resources including papers, datasets, and relevant links pertaining to image composition.

BCMI 391 Dec 30, 2022
Amazing 3D explosion animation using Pygame module.

3D Explosion Animation 💣 💥 🔥 Amazing explosion animation with Pygame. 💣 Explosion physics An Explosion instance is made of a set of Particle objec

Dylan Tintenfich 12 Mar 11, 2022
aardio的opencv库

opencv_aardio dll库下载地址:https://github.com/xuncv/opencv-plugin/releases import cv2 img = cv2.imread("./images/Lena.jpg",1) img = cv2.medianBlur(img,5)

71 Dec 31, 2022
Driver Drowsiness Detection with OpenCV & Dlib

In this project, we have built a driver drowsiness detection system that will detect if the eyes of the driver are close for too long and infer if the driver is sleepy or inactive.

Mansi Mishra 4 Oct 26, 2022
一款基于Qt与OpenCV的仿真数字示波器

一款基于Qt与OpenCV的仿真数字示波器

郭赟 4 Nov 02, 2022
TextBoxes re-implement using tensorflow

TextBoxes-TensorFlow TextBoxes re-implementation using tensorflow. This project is greatly inspired by slim project And many functions are modified ba

Gu Xiaodong 44 Dec 29, 2022
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l

Baoguang Shi 2k Dec 31, 2022
Demo processor to illustrate OCR-D Python API

ocrd_vandalize/ Demo processor to illustrate the OCR-D/core Python API Description :TODO: write docs :) Installation From PyPI pip3 install ocrd_vanda

Konstantin Baierer 5 May 05, 2022
1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge

SIIM-COVID19-Detection Source code of the 1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge. 1.INSTALLATION Ubuntu 18.04.5 LTS CUD

Nguyen Ba Dung 170 Dec 21, 2022
Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Image Detector and Convertor App created using python's Pillow, OpenCV, cvlib, numpy and streamlit packages.

Siva Prakash 11 Jan 02, 2022
This pyhton script converts a pdf to Image then using tesseract as OCR engine converts Image to Text

Script_Convertir_PDF_IMG_TXT Este script de pyhton convierte un pdf en Imagen luego utilizando tesseract como motor OCR convierte la Imagen a Texto. p

alebogado 1 Jan 27, 2022
Automatically download multiple papers by keywords in CVPR

CVFPaperHelper Automatically download multiple papers by keywords in CVPR Install mkdir PapersToRead cd PaperToRead pip install requests tqdm git clon

46 Jun 08, 2022
Course material for the Multi-agents and computer graphics course

TC2008B Course material for the Multi-agents and computer graphics course. Setup instructions Strongly recommend using a custom conda environment. Ins

16 Dec 13, 2022
This repo contains several opencv projects done while learning opencv in python.

opencv-projects-python This repo contains both several opencv projects done while learning opencv by python and opencv learning resources [Basic conce

Fatin Shadab 2 Nov 03, 2022
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

FOTS: Fast Oriented Text Spotting with a Unified Network I am still working on this repo. updates and detailed instructions are coming soon! Table of

Masao Taketani 52 Nov 11, 2022
【Auto】原神⭐钓鱼辅助工具 | 自动收竿、校准游标 | ✨您只需要抛出鱼竿,我们会帮你完成一切✨

原神钓鱼辅助工具 ✨ 作者正在努力重构代码中……会尽快带给大家一个更完美的脚本 ✨ 「您只需抛出鱼竿,然后我们会帮您搞定一切」 如果你觉得这个脚本好用,请点一个 Star ⭐ ,你的 Star 就是作者更新最大的动力 点击这里 查看演示视频 ✨ 欢迎大家在 Issues 中分享自己的配置文件 ✨ ✨

261 Jan 02, 2023
2 telegram-bots: for image recognition and for text generation

💻 📱 Telegram_Bots 🔎 & 📖 2 telegram-bots: for image recognition and for text generation. About Image recognition bot: User sends a photo and bot de

Marina Polukoshko 1 Jan 27, 2022
document image degradation

ocrodeg The ocrodeg package is a small Python library implementing document image degradation for data augmentation for handwriting recognition and OC

NVIDIA Research Projects 134 Nov 18, 2022