This example shows how to build a video question-answering system with Jina.
The index data consists of videos with subtitle information. After indexing, you can ask questions in natural language and retrieve the related video together with the timestamp at which the corresponding answer appears.
We use a YouTube video as a toy example. Install the requirements and download the data:
pip install -r requirements.txt
bash scripts/download_data.sh
By default, we index the video file toy-data/mnnC37ewQI8.mkv:
python app.py index
Query with questions:
python app.py query
To run the video search frontend, first set it up locally. You should have Node and Yarn installed on your machine.
cd frontend
yarn
This will install the necessary dependencies.
To start the search frontend, run:
yarn dev
You can see the search frontend at http://localhost:3000/.
The index flow is as below. The sentences are extracted from the subtitle file.
In the other pathway, the sentences of the subtitles are encoded by the DPRTextEncoder.
The meta information of the sentences, together with their embeddings, is stored in the SimpleIndexer.
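The sentence-extraction step can be illustrated with a minimal sketch in plain Python. It assumes the subtitles are available as text in a WebVTT- or SRT-like cue format (a timing line followed by one or more text lines). The names `SubtitleSentence` and `extract_sentences` are hypothetical and are not part of this example's code or of Jina.

```python
import re
from dataclasses import dataclass

# Matches a cue timing line such as "00:00:01.000 --> 00:00:04.000".
CUE_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[.,](\d{3})"
)


@dataclass
class SubtitleSentence:
    text: str
    start: float  # seconds from the beginning of the video
    end: float


def to_seconds(h, m, s, ms):
    """Convert hour, minute, second, and millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def extract_sentences(subtitle_text):
    """Split subtitle text into cues and keep each cue's text and timing."""
    sentences = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", subtitle_text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        for i, line in enumerate(lines):
            match = CUE_TIMING.search(line)
            if match:
                # Everything after the timing line is the cue text.
                text = " ".join(lines[i + 1:])
                if text:
                    g = match.groups()
                    sentences.append(
                        SubtitleSentence(text, to_seconds(*g[:4]), to_seconds(*g[4:]))
                    )
                break
    return sentences
```

Keeping the start time next to each sentence is what later allows an answer to be mapped back to a position in the video.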
The query flow is as shown below.
- The input query is a question, which is encoded into an embedding by DPRTextEncoder.
- The embedding of the query question is used to retrieve sentences from SimpleIndexer.
- DPRReaderRanker ranks the candidate sentences and extracts the exact answers from them.
- Text2Frame gets the timestamp and video URI of the answer candidates.
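The retrieval part of the query flow above can be sketched in plain Python. This is only an illustration under strong assumptions: `toy_embed` is a hypothetical bag-of-words stand-in for DPRTextEncoder, the in-memory list stands in for SimpleIndexer, and the sentences, URI, and timestamps are made up. The answer-extraction step performed by DPRReaderRanker is not modeled.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a text encoder: a bag-of-words count vector."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def answer(question, index, top_k=1):
    """Retrieve the closest sentences and return them with URI and timestamp."""
    query_embedding = toy_embed(question)
    scored = sorted(
        index,
        key=lambda doc: cosine(query_embedding, doc["embedding"]),
        reverse=True,
    )
    return [
        {"text": doc["text"], "uri": doc["uri"], "timestamp": doc["start"]}
        for doc in scored[:top_k]
    ]


# Made-up index entries: sentence text, embedding, video URI, start time (s).
index = [
    {"text": t, "embedding": toy_embed(t), "uri": "example.mkv", "start": s}
    for t, s in [
        ("Cats sleep most of the day.", 12.0),
        ("Bread is baked in an oven.", 48.5),
    ]
]

result = answer("Where is bread baked?", index)
```

Because each indexed sentence carries its video URI and start time, the final lookup is a plain field access once the best-matching sentence is known.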
- Download the subtitle files:
youtube-dl --write-sub --embed-subs -o toy-data/zvXkQkqd2I8 https://www.youtube.com/watch\?v\=zvXkQkqd2I8
Replace --write-sub with --write-auto-sub when no subtitle file has been uploaded manually. This uses the subtitles that YouTube generates automatically.
- Run the following:
python app.py index
python app.py query
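If you want to script the download step for several videos, a small helper along these lines may be useful. It is a sketch that assumes youtube-dl is installed and on your PATH. The function names `build_download_command` and `download` are hypothetical; the flags are the ones shown in the command above.

```python
import subprocess


def build_download_command(video_id, output_dir="toy-data", auto_subs=False):
    """Build the youtube-dl argument list for one video."""
    # Fall back to YouTube's automatic subtitles when none were uploaded.
    sub_flag = "--write-auto-sub" if auto_subs else "--write-sub"
    return [
        "youtube-dl",
        sub_flag,
        "--embed-subs",
        "-o", f"{output_dir}/{video_id}",
        f"https://www.youtube.com/watch?v={video_id}",
    ]


def download(video_id, auto_subs=False):
    """Run youtube-dl; check=True raises CalledProcessError on failure."""
    subprocess.run(build_download_command(video_id, auto_subs=auto_subs), check=True)
```

Passing the arguments as a list avoids shell quoting, so the `?` and `=` in the URL do not need to be escaped.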