SAAVN - Sound Adversarial Audio-Visual Navigation, ICLR 2022 (in PyTorch)

Last update: Aug 30, 2022

Overview

SAAVN

Code release for the paper "Sound Adversarial Audio-Visual Navigation" (ICLR 2022), implemented in PyTorch.

This code is still being cleaned up. Bugs may occur; please let me know if you run into any trouble.

Thanks

This code is based on the SoundSpaces code base.

Usage

This repo supports the AudioGoal task on the Replica and Matterport3D datasets.

The commands below train and evaluate AudioGoal with a depth sensor on Replica; the same commands apply to the Matterport3D dataset.

Training

python main.py --default av_nav --run-type train --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0
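The trailing `TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0` arguments in the command above read like `KEY VALUE` configuration overrides. The sketch below shows one common way such trailing pairs can be collected with `argparse`; it is an illustration under that assumption, not the actual argument handling in `main.py`, and `parse_overrides` is a hypothetical helper name.

```python
import argparse


def parse_overrides(pairs):
    """Turn a flat ['KEY', 'VALUE', ...] list into a dict of overrides.

    Hypothetical helper for illustration; values are kept as strings.
    """
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must be given as KEY VALUE pairs")
    return dict(zip(pairs[0::2], pairs[1::2]))


# Only a subset of the flags from the command above, to keep the sketch short.
parser = argparse.ArgumentParser()
parser.add_argument("--run-type", choices=["train", "eval"], required=True)
parser.add_argument("--exp-config", required=True)
# Everything left after the named flags is gathered into `opts`.
parser.add_argument("opts", nargs=argparse.REMAINDER, default=None)

args = parser.parse_args(
    [
        "--run-type", "train",
        "--exp-config", "exp.yaml",  # placeholder file name
        "TORCH_GPU_ID", "0",
        "SIMULATOR_GPU_ID", "0",
    ]
)
overrides = parse_overrides(args.opts)
print(overrides)
```

Under this reading, changing the GPU used for training or simulation only requires changing the numbers after `TORCH_GPU_ID` and `SIMULATOR_GPU_ID`.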

Validation (evaluate each checkpoint and generate a validation curve)

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Test the best validation checkpoint, chosen from the validation curve

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Generate demo video with audio

python main.py --default av_nav --run-type eval --exp-config [exp_config_file] --model-dir data/models/replica/av_nav/e0000/audiogoal_depth --tag-config [tag_config_file] TORCH_GPU_ID 0 SIMULATOR_GPU_ID 0

Note: [exp_config_file] is the main parameter configuration file for the experiment, while [tag_config_file] is an additional parameter configuration file used for ablation experiments.
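Because every command above shares the same flags and differs mainly in `--run-type`, the invocation can be assembled programmatically when running several ablations. The following is a minimal sketch: `build_command` is a hypothetical helper, and `exp.yaml` and `tag.yaml` are placeholder file names standing in for [exp_config_file] and [tag_config_file].

```python
import shlex


def build_command(run_type, exp_config, tag_config, model_dir,
                  torch_gpu=0, simulator_gpu=0):
    """Assemble the main.py invocation shown in this README as an argument list."""
    if run_type not in ("train", "eval"):
        raise ValueError("run_type must be 'train' or 'eval'")
    return [
        "python", "main.py",
        "--default", "av_nav",
        "--run-type", run_type,
        "--exp-config", exp_config,
        "--model-dir", model_dir,
        "--tag-config", tag_config,
        # Trailing KEY VALUE pairs, as in the commands above.
        "TORCH_GPU_ID", str(torch_gpu),
        "SIMULATOR_GPU_ID", str(simulator_gpu),
    ]


cmd = build_command(
    run_type="train",
    exp_config="exp.yaml",   # placeholder for [exp_config_file]
    tag_config="tag.yaml",   # placeholder for [tag_config_file]
    model_dir="data/models/replica/av_nav/e0000/audiogoal_depth",
)
# Print a shell-ready string; pass `cmd` to subprocess.run to execute it instead.
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems if a configuration path contains spaces.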

Citation

If you use this model in your research, please cite the following paper:

@inproceedings{YinfengICLR2022saavn,
	title = {Sound Adversarial Audio-Visual Navigation},
	author = {Yinfeng Yu and Wenbing Huang and Fuchun Sun and Changan Chen and Yikai Wang and Xiaohong Liu},
	year = {2022},
	booktitle = {ICLR},
}

Owner

YinfengYu
