A Pythonic library for Nvidia Codec.

The project is still in active development; expect breaking changes.

Why another Python library for Nvidia Codec?

Comparison to Video-Processing-Framework

Methodologies

VPF is written entirely in C++ and uses pybind to expose Python interfaces. PNC is written entirely in Python and uses ctypes to access the Nvidia C interfaces. As a result, our code tends to be more concise, less repetitive, and easier to read and write.
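To illustrate the ctypes approach in general terms, here is a minimal sketch that loads the CUDA driver library and queries its version. It is not taken from PNC's source: the helper names `load_cuda_driver` and `driver_version` are hypothetical, and the library file names are assumptions about a typical Linux or Windows driver install. On a machine without an Nvidia driver the function simply returns `None`.

```python
import ctypes
import sys


def load_cuda_driver():
    """Return a ctypes handle to the CUDA driver library, or None if absent.

    The file names below are assumptions about a typical driver install.
    """
    name = "nvcuda.dll" if sys.platform == "win32" else "libcuda.so.1"
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def driver_version():
    """Call cuDriverGetVersion through ctypes; return an int or None."""
    lib = load_cuda_driver()
    if lib is None:
        return None
    # Declare the C signature: CUresult cuDriverGetVersion(int *driverVersion)
    lib.cuDriverGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.cuDriverGetVersion.restype = ctypes.c_int
    version = ctypes.c_int(0)
    status = lib.cuDriverGetVersion(ctypes.byref(version))
    if status != 0:  # any non-zero CUresult is an error
        return None
    return version.value


if __name__ == "__main__":
    print("CUDA driver version:", driver_version())
```

The same pattern (load the shared library, declare `argtypes` and `restype`, pass structures by reference) applies to any C interface, which is why no compiled extension module is needed.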

Performance

Preliminary tests show little to no difference in performance, because the heavy lifting is done on the GPU anyway. Both libraries can saturate the GPU decoder. PNC uses more CPU than VPF, as expected from Python vs. C++, but the overhead is still negligible (less than 10% of a single Ryzen 3100 core for 8K*4K HEVC).

Resource Management

In VPF, a Surface given to the user is not owned by the user: it will be overwritten by new frames, which is counter-intuitive. Pictures are not exposed to the user at all; they are always mapped (post-processed and copied) to a Surface so that the Picture can be reused for new frames. The latter is inefficient when only a subset of Pictures is needed (e.g. screenshots).
This is because VPF allocates the bare minimum of resources needed for most decoding tasks. PNC allows the user to specify the amount of resources to be allocated for advanced applications. Users own the resources and decide when and whether to deal with them.
Managing resources is not painful: similar to pycuda, we shift the burden of managing host/device resources to the Python garbage collector. Resources (such as Picture and Surface) are automatically freed when the user drops the last reference.
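The idea of tying a resource's lifetime to a Python reference can be sketched as follows. This is an illustration only, not PNC's implementation: the `Surface` class here and the `release` callback are hypothetical stand-ins, and the "device memory" is just an integer handle so the example runs without a GPU.

```python
import gc
import weakref


class Surface:
    """Hypothetical wrapper that owns a handle and frees it when collected."""

    def __init__(self, handle, release):
        self.handle = handle
        # weakref.finalize calls release(handle) once, when this object is
        # garbage collected (or at interpreter exit, whichever comes first).
        self._finalizer = weakref.finalize(self, release, handle)


released = []  # records which handles have been "freed"

surface = Surface(handle=42, release=released.append)
alias = surface            # a second reference keeps the resource alive

del surface
gc.collect()
still_alive = list(released)   # nothing freed yet, alias still refers to it

del alias                  # dropping the last reference triggers the release
gc.collect()
```

Using `weakref.finalize` rather than `__del__` is a design choice for the sketch: the callback does not hold a reference to the object itself, so it cannot accidentally keep the resource alive.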

Things to come

TODO Cropping and scaling support in postprocessing
TODO Color space conversion from YUV (BT.601/709, full-range/limited-range) to RGB using pycuda
TODO Encoder
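For reference, the planned YUV-to-RGB conversion boils down to a small per-pixel formula. The sketch below is a plain-Python CPU version for 8-bit samples, not the pycuda kernel the TODO refers to; the function name `yuv_to_rgb` and its parameters are hypothetical. The default luma coefficients are those of BT.709 (Kr = 0.2126, Kb = 0.0722); passing Kr = 0.299, Kb = 0.114 gives BT.601.

```python
def yuv_to_rgb(y, u, v, kr=0.2126, kb=0.0722, full_range=True):
    """Convert one 8-bit YUV (YCbCr) sample to an 8-bit RGB tuple.

    kr, kb     -- luma coefficients (defaults: BT.709)
    full_range -- True: Y, U, V span 0..255
                  False: Y spans 16..235, U and V span 16..240
    """
    kg = 1.0 - kr - kb
    if full_range:
        yn = y / 255.0
        un = (u - 128) / 255.0
        vn = (v - 128) / 255.0
    else:
        yn = (y - 16) / 219.0
        un = (u - 128) / 224.0
        vn = (v - 128) / 224.0
    # Invert the definitions Cr = (R - Y) / (2 (1 - Kr)), Cb = (B - Y) / (2 (1 - Kb))
    r = yn + 2.0 * (1.0 - kr) * vn
    b = yn + 2.0 * (1.0 - kb) * un
    # Recover G from Y = Kr R + Kg G + Kb B
    g = (yn - kr * r - kb * b) / kg
    # Scale back to 8 bits, round, and clamp out-of-gamut values
    return tuple(min(255, max(0, round(c * 255))) for c in (r, g, b))


if __name__ == "__main__":
    print(yuv_to_rgb(235, 128, 128, full_range=False))      # limited-range white
    print(yuv_to_rgb(128, 128, 128, kr=0.299, kb=0.114))    # BT.601 mid-grey
```

A GPU version would apply the same arithmetic per pixel and additionally handle the chroma subsampling of the decoder's output format, which this sketch ignores.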

Acknowledgements

Many thanks to @rarzumanyan for all the help and explanations!

Owner

Zesen Qian
