LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Overview

Deep-Leafsnap

Convolutional neural networks (CNNs) have recently become popular for image tasks such as image classification, largely due to Krizhevsky et al. and their well-known paper ImageNet Classification with Deep Convolutional Neural Networks. Models such as AlexNet, VGG-16, and ResNet-50 have achieved state-of-the-art results on image classification datasets such as ImageNet and CIFAR-10.

We present an application of CNNs to the task of classifying trees by images of their leaves, specifically the 185 tree species of the Northeastern United States covered by the Leafsnap dataset. This task is difficult for traditional computer vision methods because of the large number of classes, inconsistency between images, and strong visual similarity between leaves.

Kumar et al. developed an automatic visual recognition algorithm for this problem in their 2012 paper Leafsnap: A Computer Vision System for Automatic Plant Species Identification.

Our model is based on VGG-16, modified to work with 64x64 inputs. Compared with the original Leafsnap system, our deep learning approach improves top-1 prediction accuracy from 70.8% to 86.2% and top-5 prediction accuracy from 96.8% to 98.4%, which were state-of-the-art results at the time.

Model           Top-1 Accuracy   Top-5 Accuracy
Leafsnap        70.8%            96.8%
Deep-Leafsnap   86.2%            98.4%
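
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how top-1 and top-k accuracy are commonly computed. The function name top_k_accuracy and the toy scores are illustrative assumptions, not code or data from this repository.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one per sample
    labels: list of true class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(sample_scores)),
                        key=lambda i: sample_scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example with 3 samples and 4 classes (made-up numbers).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # highest score: class 1
    [0.5, 0.1, 0.3, 0.1],  # highest score: class 0, second: class 2
    [0.3, 0.1, 0.1, 0.5],  # highest score: class 3, second: class 0
]
labels = [1, 2, 0]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

With 185 classes, top-5 accuracy counts a prediction as correct when the true species appears anywhere among the five highest-scoring classes, which is why it is much higher than top-1 accuracy for visually similar leaves.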

We noticed that our model consistently failed to recognize specific classes of trees, which lowered our overall accuracy. This is primarily because those trees have very small leaves, which are hard to preprocess and crop. Our training images were also resized to 64x64 because of limited computational resources. We plan to improve our data preprocessing and increase the image size to 224x224 in order to exceed 90% top-1 prediction accuracy.
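
The effect of the input size on a VGG-16-style network can be checked with simple arithmetic. The sketch below assumes the standard VGG-16 layout (3x3 convolutions with padding 1 that preserve spatial size, five 2x2 max-pooling stages with stride 2, and 512 channels in the last convolutional block); the function name is hypothetical and the modified network in vgg.py may differ.

```python
def vgg16_flat_features(input_size, num_pools=5, final_channels=512):
    """Return (spatial side, flattened feature count) after the conv stack."""
    side = input_size
    for _ in range(num_pools):
        side //= 2  # each 2x2, stride-2 max-pool halves height and width
    return side, side * side * final_channels


side_64, feats_64 = vgg16_flat_features(64)
side_224, feats_224 = vgg16_flat_features(224)
print(f"64x64 input:   {side_64}x{side_64}x512 -> {feats_64} features")
print(f"224x224 input: {side_224}x{side_224}x512 -> {feats_224} features")
```

Under these assumptions, the first fully connected layer sees 2,048 inputs at 64x64 instead of the 25,088 used by the original 224x224 VGG-16, so the classifier head has to be resized whenever the input size changes.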

The following goes over the code and how to set it up on your own machine.

Files

  • model.py trains a convolutional neural network on the dataset.
  • vgg.py PyTorch model code for VGG-16.
  • densenet.py PyTorch model code for DenseNet-121.
  • resnet.py PyTorch model code for ResNet.
  • dataset.py creates a new train/test dataset by cropping the leaf and augmenting the data.
  • utils.py helps do some of the hardcore image processing in dataset.py.
  • averagemeter.py helper class which keeps track of a bunch of averages when training.
  • leafsnap-dataset-images.csv is the CSV file corresponding to the dataset.
  • requirements.txt contains the pip requirements to run the code.
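
As an illustration of what a running-average helper such as averagemeter.py typically does, here is a minimal sketch modeled on the pattern used in many PyTorch training scripts. The class and attribute names are assumptions; the repository's own implementation may differ.

```python
class AverageMeter:
    """Tracks the latest value and the running average of a metric."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0.0   # most recent value
        self.sum = 0.0   # weighted sum of all values seen so far
        self.count = 0   # total weight (for example, number of samples)
        self.avg = 0.0   # running average

    def update(self, val, n=1):
        """Record `val`, the mean of a metric over a batch of `n` samples."""
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


# Example: average a per-batch loss over batches of different sizes.
losses = AverageMeter()
losses.update(0.9, n=32)
losses.update(0.5, n=32)
losses.update(0.4, n=16)
```

Weighting each update by the batch size keeps the average correct when the last batch of an epoch is smaller than the others.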

Installation

To run the models and code, make sure you have Python installed.

Install PyTorch by following the directions on the official PyTorch website (pytorch.org).

Clone the repo onto your local machine and cd into the directory.

git clone https://github.com/sujithv28/Deep-Leafsnap.git
cd Deep-Leafsnap

Install all the Python dependencies:

pip install -r requirements.txt

Make sure scikit-learn (imported as sklearn) is updated to the latest version. The package name on PyPI is scikit-learn:

pip install --upgrade scikit-learn

Also make sure you have OpenCV installed, either through pip or Homebrew. You can check that it works by starting Python and importing the module; the import should complete without errors:

python
import cv2
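
The same check can be scripted for several dependencies at once. The sketch below uses only the standard library; the helper name missing_modules and the list of module names are assumptions based on the packages mentioned above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# These are import names, not pip package names:
# OpenCV is imported as `cv2` and scikit-learn as `sklearn`.
required = ["cv2", "torch", "sklearn"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```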

Download Leafsnap's image data and extract it into the main directory by running the following commands from the repository root. The original data is published by the Leafsnap project.

wget -O data.zip "https://www.dropbox.com/s/dp3sk8wpiu9yszg/data.zip?dl=0"
unzip data.zip
rm data.zip

Create the Training and Testing Data

To create the dataset, run

python dataset.py

This cleans the dataset by cropping each image to the portion that contains the leaf and resizing it to 64x64. To change the image size, open utils.py and change img = misc.imresize(img, (64,64)) to the size you want.
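
To make the cropping step concrete, the following dependency-free sketch finds the bounding box of the foreground pixels in a binary mask and crops to it. It illustrates the general idea only: the function names are hypothetical, the mask is assumed to come from an earlier segmentation step, and the actual logic in dataset.py and utils.py may differ.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the nonzero region, or None if empty.

    mask: list of rows, each a list of 0/1 values (1 = leaf pixel).
    bottom and right are exclusive, so they can be used directly in slices.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def crop(image, box):
    """Crop a list-of-rows image to the given bounding box."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]


# Tiny hand-made mask: the "leaf" occupies rows 1-2 and columns 1-3.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
cropped = crop(mask, box)
```

In practice the cropped region would then be resized to the target resolution (64x64 by default) with an image library.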

Training Model

To train the model, run

python model.py
Owner
Sujith Vishwajith
Computer Science & Math @ University of Maryland