GuideDog is an AI/ML-based, 100% voice-controlled mobile app designed to assist visually impaired users

Last update: Nov 24, 2021

Overview

Guidedog

Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma

GuideDog is an AI/ML-based mobile app designed to assist visually impaired users, and it is 100% voice-controlled. You can think of it as a "speaking guide dog," as the name suggests. It has three key features, all based on the scene captured by your phone's camera:

- Reads text upon command
- Describes the scene around you upon command
- Warns you if there is an obstacle in front of you

Check out this demo video to learn more about our app!

Android App

UI/UX
- Simple and Responsive
- Voice Assistant architecture for targeted audience
Libraries / APIs
- Google Cloud Speech-to-Text and Text-to-Speech
- Android SDK, AndroidX
- ML Kit Object Detection and Tracking API
- TensorFlow Lite MobileNet Image Classification Model

Backend

Flask API
- Image Captioning
- Optical Character Recognition
Deployment
- Google App Engine
- fast central API with different endpoints
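
The idea of one central API with a separate endpoint per task can be sketched without any web framework. The example below is only an illustration, not the project's Flask code: the endpoint paths `/caption` and `/ocr`, the `route` decorator, and the placeholder handlers are all assumptions made for this sketch. In Flask, the same registration would typically be done with `@app.route`.

```python
from typing import Callable, Dict

# Registry mapping an endpoint path to the function that handles it.
# (Hypothetical names; the real service's routes are not listed in this README.)
ROUTES: Dict[str, Callable[[bytes], dict]] = {}


def route(path: str):
    """Register a handler for `path`, loosely mimicking a web framework's route decorator."""
    def register(handler: Callable[[bytes], dict]):
        ROUTES[path] = handler
        return handler
    return register


@route("/caption")
def caption_image(image_bytes: bytes) -> dict:
    # Placeholder: a real handler would run the image captioning model here.
    return {"task": "caption", "bytes_received": len(image_bytes)}


@route("/ocr")
def read_text(image_bytes: bytes) -> dict:
    # Placeholder: a real handler would run optical character recognition here.
    return {"task": "ocr", "bytes_received": len(image_bytes)}


def dispatch(path: str, image_bytes: bytes) -> dict:
    """Send the uploaded image to the handler registered for `path`."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"error": "unknown endpoint: " + path}
    return handler(image_bytes)
```

With this layout the mobile client talks to a single service and picks the task by path, for example `dispatch("/ocr", photo_bytes)`.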

Image Captioning

We used TensorFlow to build and train an image captioning model on MS-COCO 2014, based on the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention". The model uses a standard convolutional network as an encoder to extract features from images (we use Inception V3) and feeds those features into an attention-based decoder that generates sentences. While the paper used an LSTM as the decoder, we use a simpler RNN instead.
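
To make the attention step more concrete, here is a minimal, framework-free sketch of additive (soft) attention over a set of image feature vectors. It is not the project's TensorFlow code: the function names, the tiny weight matrices, and the toy inputs are assumptions chosen for illustration. In the real model the features would come from Inception V3 and the weights would be learned.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def additive_attention(features, hidden, w_feat, w_hid, v):
    """Return (context, weights) for one decoding step.

    features: list of image-region feature vectors from the encoder
    hidden:   current decoder (RNN) state
    w_feat, w_hid, v: attention parameters (learned in a real model)
    """
    projected_hidden = matvec(w_hid, hidden)
    scores = []
    for feature in features:
        projected_feature = matvec(w_feat, feature)
        # score = v . tanh(W_feat * feature + W_hid * hidden)
        energy = [math.tanh(a + b) for a, b in zip(projected_feature, projected_hidden)]
        scores.append(sum(vi * ei for vi, ei in zip(v, energy)))
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the region features.
    dim = len(features[0])
    context = [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
    return context, weights


# Toy example: three image regions with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
context, weights = additive_attention(features, hidden, identity, identity, [1.0, 1.0])
```

At each decoding step the decoder would combine `context` with the previous word embedding to predict the next word, and `weights` shows which image regions the model is attending to.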

Get more insights : Devpost

Owner

Kyuhee Jo
