Yet another video caption

Last update: May 26, 2022

Related tags

Deep Learning yet-another-video-caption

Overview

yet-another-video-caption

数据集配置

准备数据集

将原始数据集重新组织成统一的格式后，放置于 ./dataset 中。

数据集的组织格式为：

./dataset
    train/
        video/
            *.avi
        ...
        info.json
    test/
        video/ 
            *.avi
        ...

自动配置

通常你只需要使用数据集的一个子集，此时请考虑运行自动抽取脚本 makedata.py。

所有数据位于 ./data 中。

所有视频（包括 train/val/test）位于 ./data/video 中。

所有视频信息（包括 train/val/test）输入到 ./data/input.json。

程序会在 ./data 中产生一些中间信息，请勿修改。

依赖

pip install tqdm pillow pretrainedmodels nltk

此外，请确保已当前环境下已经正确配置 CUDA 运行库，CUDNN，Pytorch(GPU)，ffmpeg，JDK

食用步骤

确保数据集已正确配置
确保依赖已经正确安装
抽取数据，将你希望使用的 train/val/test 划分参数输入 makedata.py 中，然后执行该脚本
依次执行（请自行修改 batch_size 和 saved_model 参数！）

python prepro_feats.py --output_dir data/feats/resnet152 --model resnet152
python prepro_vocab.py
python train.py --epochs 3001 --batch_size 1 --checkpoint_path data/save --feats_dir data/feats/resnet152 --model S2VTAttModel --with_c3d 0 --dim_vid 2048
python eval.py --recover_opt data/save/opt_info.json --saved_model data/save/model_10.pth --batch_size 1

速度测试

以下结果测试于单张 2080Ti

预处理（ResNet152 特征提取）：共 40min

训练速度（batch_size=32）：6.20 it/s

Todo

大小写问题

References

https://github.com/xiadingZ/video-caption.pytorch

Yet another video caption

Related tags

Overview

yet-another-video-caption

数据集配置

准备数据集

自动配置

依赖

食用步骤

速度测试

Todo

References

Owner

Fan Zhimin

Patch-Based Deep Autoencoder for Point Cloud Geometry Compression

Image super-resolution through deep learning

Office source code of paper UniFuse: Unidirectional Fusion for 360$^\circ$ Panorama Depth Estimation

[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

A curated list of awesome resources related to Semantic Search🔎 and Semantic Similarity tasks.

GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

Matching python environment code for Lux AI 2021 Kaggle competition, and a gym interface for RL models.

Iowa Project - My second project done at General Assembly, focused on feature engineering and understanding Linear Regression as a concept

Improving Factual Consistency of Abstractive Text Summarization

Mix3D: Out-of-Context Data Augmentation for 3D Scenes (3DV 2021)

Simulation code and tutorial for BBHnet training data

This is the codebase for the ICLR 2021 paper Trajectory Prediction using Equivariant Continuous Convolution

PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"

Implementation of algorithms for continuous control (DDPG and NAF).

An implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019).

The MLOps platform for innovators 🚀

PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!