VIsually-Pivoted Audio and(N) Text

Last update: Nov 04, 2022

Related tags

Overview

VIP-ANT: VIsually-Pivoted Audio and(N) Text

Code for the paper Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer.

Data

AudioSet can be downloaded and preprocessed via this tool.

Vision-Audio (VA) Pre-training

Check out the running script bash/run_bimodal_va.sh.

Audio-Text (AT) Fine-tuning

Check out the running script bash/run_bimodal_at.sh.

Dependencies

Dockerfile defines the minimum dependencies of the repo.

Citing VIP-ANT

@misc{vip-ant,
      title={Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer},
      author={Yanpeng Zhao and Jack Hessel and Youngjae Yu and Ximing Lu and Rowan Zellers and Yejin Choi},
      url={https://arxiv.org/abs/2112.08995},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      eprint={2112.08995},
      year={2021},
}

License

MIT

Owner

Yän.PnG

GitHub Repository https://arxiv.org/abs/2112.08995

A curated list of awesome game datasets, and tools to artificial intelligence in games

🎮 Awesome Game Datasets In computer science, Artificial Intelligence (AI) is intelligence demonstrated by machines. Its definition, AI research as th

454 Jan 03, 2023

Real-Time Seizure Detection using EEG: A Comprehensive Comparison of Recent Approaches under a Realistic Setting

Real-Time Seizure Detection using Electroencephalogram (EEG) This is the repository for "Real-Time Seizure Detection using EEG: A Comprehensive Compar

30 Dec 17, 2022

Official Repsoitory for "Activate or Not: Learning Customized Activation." [CVPR 2021]

CVPR 2021 | Activate or Not: Learning Customized Activation. This repository contains the official Pytorch implementation of the paper Activate or Not

184 Dec 27, 2022

Segmentation vgg16 fcn - cityscapes

VGGSegmentation Segmentation vgg16 fcn - cityscapes Priprema skupa skripta prepare_dataset_downsampled.py Iz slika cityscapesa izrezuje haubu automobi

6 Oct 24, 2020

A data-driven approach to quantify the value of classifiers in a machine learning ensemble.

Documentation | External Resources | Research Paper Shapley is a Python library for evaluating binary classifiers in a machine learning ensemble. The

188 Dec 29, 2022

RAMA: Rapid algorithm for multicut problem

RAMA: Rapid algorithm for multicut problem Solves multicut (correlation clustering) problems orders of magnitude faster than CPU based solvers without

60 Dec 13, 2022

This repo includes our code for evaluating and improving transferability in domain generalization (NeurIPS 2021)

Transferability for domain generalization This repo is for evaluating and improving transferability in domain generalization (NeurIPS 2021), based on

9 Nov 29, 2022

Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

17 Nov 29, 2022

An executor that performs image segmentation on fashion items

ClothingSegmenter U2NET fashion image/clothing segmenter based on https://github.com/levindabhi/cloth-segmentation Overview The ClothingSegmenter exec

5 Mar 30, 2022

Pytorch implementation of YOLOX、PPYOLO、PPYOLOv2、FCOS an so on.

简体中文 | English miemiedetection 概述 miemiedetection是女装大佬咩酱基于YOLOX进行二次开发的个人检测库（使用的深度学习框架为pytorch），支持Windows、Linux系统，以女装大佬咩酱的名字命名。miemiedetection是一个不需要安装的

248 Jan 02, 2023

Visualization toolkit for neural networks in PyTorch! Demo -->

FlashTorch A Python visualization toolkit, built with PyTorch, for neural networks in PyTorch. Neural networks are often described as "black box". The

692 Dec 29, 2022

Code for Ditto: Building Digital Twins of Articulated Objects from Interaction

Ditto: Building Digital Twins of Articulated Objects from Interaction Zhenyu Jiang, Cheng-Chun Hsu, Yuke Zhu CVPR 2022, Oral Project | arxiv News 2022

78 Dec 22, 2022

Image Segmentation and Object Detection in Pytorch

Image Segmentation and Object Detection in Pytorch Pytorch-Segmentation-Detection is a library for image segmentation and object detection with report

732 Dec 10, 2022

Official implementation of the paper: "LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech"

LDNet Author: Wen-Chin Huang (Nagoya University) Email: Wen-Chin Huang (unilight) 40 Nov 20, 2022

ServiceX Transformer that converts flat ROOT ntuples into columnwise data

ServiceX_Uproot_Transformer ServiceX Transformer that converts flat ROOT ntuples into columnwise data Usage You can invoke the transformer from the co

0 Jan 20, 2022

ML powered analytics engine for outlier detection and root cause analysis.

Website • Docs • Blog • LinkedIn • Community Slack ML powered analytics engine for outlier detection and root cause analysis ✨ What is Chaos Genius? C

523 Jan 04, 2023

Official Pytorch implementation of "Unbiased Classification Through Bias-Contrastive and Bias-Balanced Learning (NeurIPS 2021)

Unbiased Classification Through Bias-Contrastive and Bias-Balanced Learning (NeurIPS 2021) Official Pytorch implementation of Unbiased Classification

17 Jan 01, 2023

Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

46 Dec 14, 2022

Deep Reinforcement Learning based Trading Agent for Bitcoin

Deep Trading Agent Deep Reinforcement Learning based Trading Agent for Bitcoin using DeepSense Network for Q function approximation. For complete deta

669 Dec 29, 2022

[ICLR 2021] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

CPT: Efficient Deep Neural Network Training via Cyclic Precision Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin Accep

26 Oct 25, 2022