A live streaming chatroom involving multiple modalities, such as voice, gesture, and facial expression

Last update: Dec 02, 2021

Overview

HiLive

A live streaming chatroom involving multiple modalities, such as voice, gesture, and facial expression.

Introduction

We focus on demonstrating the design of HiLive and on explaining the advantages of, and the considerations behind, its design features, applying the knowledge and design principles learned in CS3483.

As mentioned in our previous design reports, the project is motivated by the limited chatroom experience in live streams. The interface is designed to provide a better chatroom experience on a live streaming platform and is based on Twitch, a popular video game live streaming platform. It could also be generalized and adopted by video platforms such as YouTube. Viewers usually type messages in the chatroom and send emojis to communicate with others and liven up the atmosphere in the streaming room. However, this monotonous form of interaction has long been criticized by users. Therefore, we propose HiLive, where people can not only send emojis that are generated according to their recognized facial expression but also speak and send the transcribed message directly, without typing it themselves. With the gesture recognition function, users are no longer bound to the keyboard, so they can have an immersive live streaming interaction experience.

Features

Front Page

The front page leverages the Twitch API and its OAuth authentication system. Users can register and log in to the website with their Twitch account. The front page also provides a list of available live streams that users can click to enter.
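As a rough illustration of how a list of live streams can be requested, the sketch below builds a call to the Twitch Helix "Get Streams" endpoint, which expects a `Client-Id` header and an OAuth bearer token. The function names, the `TwitchStream` subset of fields, and the link format are assumptions made for this example; they are not taken from the project's code.

```typescript
// Subset of the fields returned by GET https://api.twitch.tv/helix/streams.
interface TwitchStream {
  user_login: string;
  user_name: string;
  title: string;
  viewer_count: number;
}

interface StreamsRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request for the Helix "Get Streams" endpoint.
// clientId and accessToken are obtained through the Twitch OAuth flow.
function buildStreamsRequest(clientId: string, accessToken: string, first = 20): StreamsRequest {
  // Helix accepts between 1 and 100 results per page.
  const pageSize = Math.min(100, Math.max(1, Math.floor(first)));
  return {
    url: `https://api.twitch.tv/helix/streams?first=${pageSize}`,
    headers: {
      "Client-Id": clientId,
      Authorization: `Bearer ${accessToken}`,
    },
  };
}

// Reduce the API payload to what a front-page list needs.
// The label and href format are hypothetical.
function toChannelLinks(streams: TwitchStream[]): { label: string; href: string }[] {
  return streams.map((s) => ({
    label: `${s.user_name}: ${s.title} (${s.viewer_count} viewers)`,
    href: `/${encodeURIComponent(s.user_login)}`,
  }));
}
```

The returned `url` and `headers` can be passed to `fetch`, and the `data` array of the JSON response can then be handed to `toChannelLinks`.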

Recommended Channels Component

The component is completely reusable: it only takes a data array from the official Twitch API and shuffles the results (a simple stand-in for a recommended-channels "algorithm").
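The source does not say how the results are shuffled. One common choice is the Fisher-Yates shuffle, sketched below under that assumption. The function name `shuffled` and the injectable `random` parameter are illustrative only.

```typescript
// Return a shuffled copy of items using the Fisher-Yates algorithm.
// The input array is left untouched so the component stays side-effect free.
function shuffled<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    // Pick a position in [0, i] and swap it with position i.
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}
```

Passing a custom `random` function makes the order reproducible, which helps when testing a component that renders the shuffled list.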

Facial Expression Recognition

The facial-expression-to-emoji recognition is an interesting feature that lets users interact with streamers directly. We trained a model with more than 2000 images across the different emoji categories. Predictions are produced in real time, as can be seen in the demo video we provided. The following figure captures some of the facial expression recognition results.
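To make the last step concrete, here is a minimal sketch of turning a model's per-class probabilities into an emoji. The labels, the emoji mapping, and the 0.6 confidence threshold are all hypothetical, since the source does not list the trained categories or how predictions are filtered.

```typescript
// Hypothetical label-to-emoji mapping; the real categories are not given in the source.
const EMOJI_BY_LABEL: Record<string, string> = {
  happy: "😄",
  sad: "😢",
  surprised: "😮",
  angry: "😠",
};

interface Prediction {
  label: string;
  probability: number;
}

// Pick the most probable class and return its emoji.
// Returns null when no class is confident enough or the label has no emoji.
function pickEmoji(predictions: Prediction[], minConfidence = 0.6): string | null {
  let best: Prediction | null = null;
  for (const p of predictions) {
    if (best === null || p.probability > best.probability) {
      best = p;
    }
  }
  if (best === null || best.probability < minConfidence) {
    return null;
  }
  return EMOJI_BY_LABEL[best.label] ?? null;
}
```

Returning `null` for low-confidence frames is one way to avoid sending an emoji on every noisy prediction.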

Speech to Text Component

The Speech to Text component uses the Google Cloud Speech Recognition API to transcribe the user's voice to text and send it to the chatroom. The website captures the user's voice with the Web Audio API (as documented on MDN): it streams the user's voice and activates speech-to-text once it receives the real-time speech data in byte format.
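The Web Audio API delivers audio as 32-bit float samples in the range [-1, 1], while speech services commonly accept 16-bit signed little-endian PCM (called LINEAR16 in Google Cloud's API). The sketch below shows that conversion. Whether the project uses this exact encoding is an assumption, and `floatTo16BitPCM` is a name chosen for this example.

```typescript
// Convert Web Audio float samples (range [-1, 1]) into 16-bit signed
// little-endian PCM bytes. Assumes the recognizer is configured for this encoding.
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true); // true = little-endian
  }
  return new Uint8Array(buffer);
}
```

Each chunk of captured samples can be converted this way before being streamed to the server that talks to the speech API.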

Gesture Recognition Component

Gesture recognition uses the repository from Real-time Hand Gesture Recognition with 3D CNNs. Here is the demo video of the outcome.

Figure: A real-time simulation of the architecture with input video from EgoGesture dataset (on left side) and real-time (online) classification scores of each gesture (on right side) are shown, where each class is annotated with different color.
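Per-frame classification scores like those in the figure tend to fluctuate, so a consumer of the scores usually smooths them before triggering an action. The sketch below uses a plain moving average over a sliding window. It is an illustration only and is not the method implemented in the referenced repository; `createGestureSmoother`, the window size, and the threshold are hypothetical.

```typescript
// Create a smoother that averages per-class scores over the last
// windowSize frames. The returned function takes one frame of scores and
// returns the index of the detected class, or -1 if nothing is detected yet.
// All frames are expected to contain the same number of classes.
function createGestureSmoother(windowSize: number, threshold: number): (scores: number[]) => number {
  const recent: number[][] = [];
  return function push(scores: number[]): number {
    recent.push(scores);
    if (recent.length > windowSize) {
      recent.shift(); // drop the oldest frame
    }
    if (recent.length < windowSize) {
      return -1; // not enough frames collected yet
    }
    let bestIndex = -1;
    let bestMean = threshold;
    for (let c = 0; c < scores.length; c++) {
      let sum = 0;
      for (const frame of recent) {
        sum += frame[c];
      }
      const mean = sum / recent.length;
      // Only classes whose average reaches the threshold can win.
      if (mean >= bestMean) {
        bestMean = mean;
        bestIndex = c;
      }
    }
    return bestIndex;
  };
}
```

The returned class index could then be mapped to a chat action, such as sending a message or an emoji, so that users do not need the keyboard.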

Built With:

Run this locally

To run this project locally, you'll need Node.js installed.

Install dependencies, preferably with yarn, but you can also use npm install.

Create a .env file in the root of the folder based on .env.example.

Run your Next.js app with yarn dev or npm run dev.

Go to localhost:3000 and check out this amazing clone.

Owner

Ryan Yen
