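Extract hard-coded subtitles from video with a deep learning framework. A Docker container removes the need to install deep learning libraries, and a local API separates the user interface from the recognition backend.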

Last update: Aug 06, 2022

Overview

extract-video-subtittle

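Extracts hard-coded subtitles from video with a deep learning framework.

Recognition runs locally; no internet connection is required.

Recognition speed on CPU is reasonable.

The container exposes an API.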

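Runtime environment

The runtime environment is easy to set up: I built a Docker container so that you do not need to install any deep learning packages.

A Windows GUI is provided.

The container is the CPU version.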

Video demo

https://www.bilibili.com/video/BV18Q4y1f774/

How to use the program

1. Start the backend container first

docker run -d -p 6666:6666 m986883511/extract_subtitles

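2. Start the program

A brief tour of the window

1: Click the button on the left to connect to the container started in step 1.

2: Overall progress of subtitle extraction for the video.

3: Position of the currently displayed video frame, i.e. the video seek bar.

4: Recognized text is shown here.

3. Select a video and confirm the subtitle position

Click the Select Video (选择视频) button, then drag the seek bar to a frame that shows a subtitle. Next, click Select Subtitle Area (选择字幕区域) and draw a rectangle on the video.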
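The rectangle drawn in this step has to be turned into a crop region before recognition. The sketch below shows one way to do that; it is not taken from the project's source. The function name `normalize_rect` and the (left, top, right, bottom) convention are assumptions made for illustration.

```python
from typing import Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def normalize_rect(x0: int, y0: int, x1: int, y1: int,
                   frame_w: int, frame_h: int) -> Rect:
    """Turn two drag points into a rectangle clamped to the frame."""
    # The user may drag in any direction, so order the coordinates first.
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))
    # Clamp to the frame so a drag that leaves the video still gives a valid crop.
    left = max(0, min(left, frame_w))
    right = max(0, min(right, frame_w))
    top = max(0, min(top, frame_h))
    bottom = max(0, min(bottom, frame_h))
    if left == right or top == bottom:
        raise ValueError("subtitle region has zero area")
    return left, top, right, bottom


# Dragging from bottom-right to top-left on a 1920x1080 frame.
region = normalize_rect(1700, 1040, 220, 930, 1920, 1080)
print(region)  # (220, 930, 1700, 1040)
```

Ordering and clamping the points up front means later stages can assume a well-formed region regardless of how the rectangle was drawn.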

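4. Click Test API Connection (测试连接API)

If the backend is healthy, the status shows as connected. At this point every prerequisite is ready.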
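If the GUI reports that it cannot connect, it can help to check outside the program whether the container's port is reachable at all. The sketch below only opens a TCP connection to port 6666 (the port published by the `docker run` command in step 1). It does not call any API endpoint, because the README does not document the endpoint paths. The function name `backend_reachable` is hypothetical.

```python
import socket


def backend_reachable(host: str = "127.0.0.1", port: int = 6666,
                      timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the backend port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("connected" if backend_reachable() else "backend not reachable")
```

A successful TCP connection only shows that something is listening on the port; the in-app connection test remains the authoritative check.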

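5. Start recognition

Click the button labeled "请先完成前几步" ("Please complete the previous steps first"). Internally, the run is divided into these stages:

The audio track is extracted locally with ffmpeg and saved to the temp directory (0%-10%).
The audio file is sent to the container over the API; inside the container the spleeter library isolates the vocals and saves the result to the container's temp directory. This stage is slow and heavy on CPU and memory (10%-30%).
Over the API, the vocal track is split into segments at pauses and the segment list is returned. This stage is fairly quick (30%-40%).
Subtitle recognition runs over the time ranges of the speech segments (40%-100%).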
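The percentage ranges above imply that each stage reports its own progress and that this is mapped onto one overall bar. The following is a minimal sketch of that mapping, using the ranges from the list. The stage names and the function `overall_progress` are made up for illustration and are not the project's identifiers.

```python
# Stage boundaries in percent, taken from the stage list above.
# The stage names are hypothetical labels, not names from the project.
STAGES = {
    "extract_audio": (0, 10),     # local ffmpeg audio extraction
    "separate_vocals": (10, 30),  # spleeter inside the container
    "split_on_pauses": (30, 40),  # segmentation of vocals at pauses
    "recognize": (40, 100),       # subtitle recognition per segment
}


def overall_progress(stage: str, fraction: float) -> float:
    """Map a stage-local fraction (0.0 to 1.0) to the overall percentage."""
    start, end = STAGES[stage]
    fraction = min(1.0, max(0.0, fraction))  # guard against out-of-range input
    return start + (end - start) * fraction


print(overall_progress("separate_vocals", 0.5))  # 20.0
print(overall_progress("recognize", 0.25))       # 55.0
```

Note that a bar stuck between 10% and 30% corresponds to the vocal separation stage, which the list describes as the slow one.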

When progress reaches 100%, the temp directory contains an .srt subtitle file with the same name as the video.
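For readers unfamiliar with the output format, the sketch below shows how timed text can be rendered as SRT: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then the text. It illustrates the file format only and is not the project's writer. The helper names `srt_timestamp` and `to_srt` and the sample cues are assumptions.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


# Sample cues invented for the example.
cues = [(1.5, 3.25, "Hello"), (65.0, 67.04, "World")]
print(to_srt(cues))
```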

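Running the backend

The backend API container image is hosted on Docker Hub.

Pulling the image may take a long time. Install Docker beforehand and configure a Docker registry mirror; you may also need to run docker login first.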

docker run -d -p 6666:6666 m986883511/extract_subtitles

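Files missing from this repository

Because of slow network speed and firewall restrictions, large files could not be pushed. See .gitignore for the files that were left out.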

Other

Video extraction

# Extract a video clip
ffmpeg -ss 00:15:45 -t 00:02:15 -i test/three_body_3_7.mp4 -vcodec copy -acodec copy test/3body.mp4
# Package the GUI program
C:/Python/Python38-32/Scripts/pyinstaller.exe main.spec
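The clip-extraction command above can also be assembled from Python when several clips are needed. The sketch below builds the same ffmpeg argument list and prints it; the function name `clip_command` is hypothetical, and the actual call to ffmpeg is left as a comment so the example runs without ffmpeg installed.

```python
import shlex
from typing import List


def clip_command(src: str, dst: str, start: str, duration: str) -> List[str]:
    """Build the ffmpeg argument list for a stream-copied clip."""
    return [
        "ffmpeg",
        "-ss", start,        # seek position in the input
        "-t", duration,      # length of the clip
        "-i", src,
        "-vcodec", "copy",   # copy the video stream without re-encoding
        "-acodec", "copy",   # copy the audio stream without re-encoding
        dst,
    ]


cmd = clip_command("test/three_body_3_7.mp4", "test/3body.mp4",
                   "00:15:45", "00:02:15")
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.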

References

The deep learning source code in this project is located in /docker/backend.

It is based on the original author's project: https://github.com/YaoFANGUK/video-subtitle-extractor

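Comments

Vocal extraction never produces a result

The video is an episode of a TV series, a little over 40 minutes long. CPU version. I previously used YaoFANGUK/video-subtitle-extractor to extract the subtitles; it worked and was accurate, but it took a long time. I saw that this author uses audio analysis to reduce the number of frames that need recognition, so I gave it a try. However, at the vocal extraction stage I have waited nearly 50 minutes with no result, and CPU usage is only around 1%, which is clearly abnormal. The whole run with YaoFANGUK/video-subtitle-extractor probably did not take this long. In addition, autosub also extracts the audio for speech-recognition subtitles, and its voice detection is fast: the same video finishes in a few minutes. Could the author please take a look at what is going wrong?

opened by royzengyi 2

Project inquiry

Hello, I tried this software and think it is quite good, although there are still a fair number of problems in real-world use.

I am an independent developer and would be willing to pay or to collaborate on improving it so that the project becomes more practical. Would you be interested?

I could not find any contact details, so I am trying through an issue. Feel free to delete this after reading it. Thank you.

My email is yedaxia#foxmail.com

opened by YeDaxia 1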

Releases(0.2.0)

0.2.0(Aug 2, 2021)

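1. Fixed the missing DLL for ffmpeg. 2. Fixed the error when double-clicking the exe by giving it an English file name. 3. Added config.json, which lets you configure the backend IP and port. 4. Fixed the bug where ffmpeg was not added to the system PATH.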
Source code(tar.gz)
Source code(zip)
extract-video-subtittle-v0.2.0.7z(65.84 MB)
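The 0.2.0 notes mention a config.json for the backend IP and port but do not show its layout. The sketch below is one way such a file could be read, with defaults matching the `docker run -p 6666:6666` command. The key names `ip` and `port` and the function `load_backend_config` are assumptions; check the config.json shipped with the release for the real keys.

```python
import json
from pathlib import Path

# Key names are assumptions for illustration, not confirmed by the project.
DEFAULTS = {"ip": "127.0.0.1", "port": 6666}


def load_backend_config(path: str = "config.json") -> dict:
    """Read backend settings, falling back to defaults when the file is absent."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        # Values from the file override the defaults key by key.
        config.update(json.loads(file.read_text(encoding="utf-8")))
    return config


cfg = load_backend_config()
print(f"backend at {cfg['ip']}:{cfg['port']}")
```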

Owner

歌者

Lose your humanity and you lose much; lose your bestiality and you lose everything; only by staying alive can you burn yourself.
