A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

Overview

MUGEN Dataset

Project Page | Paper

Setup

conda create --name MUGEN python=3.6
conda activate MUGEN
pip install --ignore-installed https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.12.0-cp36-cp36m-linux_x86_64.whl 
module load cuda/9.0
module load cudnn/v7.4-cuda.10.0
git clone coinrun_MUGEN
cd coinrun_MUGEN
pip install -r requirements.txt
conda install -c conda-forge mpi4py
pip install -e .

Training Agents

Basic training command:

python -m coinrun.train_agent --run-id myrun --save-interval 1

After each parameter update, this will save a copy of the agent to ./saved_models/. Results are logged to /tmp/tensorflow by default.

Run parallel training using MPI:

mpiexec -np 8 python -m coinrun.train_agent --run-id myrun

Train an agent on a fixed set of N levels. With N = 0, the training set is unbounded.

python -m coinrun.train_agent --run-id myrun --num-levels N
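
The difference between a bounded and an unbounded training set can be pictured with a short Python sketch. This is only an illustration of the idea, not the code used by coinrun.train_agent: the function name pick_level_seed and the constant MAX_LEVEL_SEED are hypothetical, and the real seed range is not stated in this README.

```python
import random

# Hypothetical upper bound on level seeds; the real generator's range is an assumption.
MAX_LEVEL_SEED = 2**31 - 1


def pick_level_seed(num_levels, rng):
    """Pick a level seed.

    With num_levels > 0 the seed comes from a fixed pool of num_levels values,
    so the agent keeps revisiting the same levels. With num_levels == 0 the
    seed is drawn from the full (hypothetical) range, i.e. an unbounded set.
    """
    if num_levels < 0:
        raise ValueError("num_levels must be non-negative")
    if num_levels == 0:
        return rng.randrange(MAX_LEVEL_SEED)
    return rng.randrange(num_levels)


if __name__ == "__main__":
    rng = random.Random(0)
    fixed_pool = {pick_level_seed(500, rng) for _ in range(10_000)}
    # At most 500 distinct levels are ever visited when num_levels is 500.
    print(len(fixed_pool) <= 500)
```

Under this picture, a small N makes it easy to measure overfitting to the training levels, while N = 0 means the agent rarely sees the same level twice.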

Continue training an agent from a checkpoint:

python -m coinrun.train_agent --run-id newrun --restore-id myrun

View training options:

python -m coinrun.train_agent --help

Example commands for MUGEN agents:

Base model

python -m coinrun.train_agent --run-id name_your_agent \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 80 \
                --bump-head-penalty 0.25 --kill-monster-reward 10.0

Add squat penalty to reduce excessive squatting

python -m coinrun.train_agent --run-id gamev2_fine_tune_m4_squat_penalty \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 811 \
                --bump-head-penalty 0.1 --kill-monster-reward 5.0 --squat-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_0

Larger model

python -m coinrun.train_agent --run-id gamev2_largearch_bump_head_penalty_0.05_0 \
                --architecture impalalarge --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 51 \
                --bump-head-penalty 0.05 --kill-monster-reward 10.0

Add reward for dying

python -m coinrun.train_agent --run-id gamev2_fine_tune_squat_penalty_die_reward_3.0 \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 857 \
                --bump-head-penalty 0.1 --kill-monster-reward 5.0 --squat-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_squat_penalty --die-penalty -3.0

Add jump penalty

python -m coinrun.train_agent --run-id gamev2_fine_tune_m4_jump_penalty \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 811 \
                --bump-head-penalty 0.1 --kill-monster-reward 10.0 --jump-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_0
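
To make the role of the shaping flags above easier to follow, here is a minimal Python sketch of how such terms might combine into a per-step reward. It is an assumption, not the game's actual reward code: the event names, the ShapingConfig class, and the sign convention (penalties are subtracted, so a negative --die-penalty behaves as a reward) are all hypothetical and only chosen to match the flag names and the "Add reward for dying" example.

```python
from dataclasses import dataclass


@dataclass
class ShapingConfig:
    # Field names mirror the command-line flags; defaults here are placeholders.
    bump_head_penalty: float = 0.0
    kill_monster_reward: float = 0.0
    squat_penalty: float = 0.0
    jump_penalty: float = 0.0
    die_penalty: float = 0.0


def shaped_reward(base_reward, events, cfg):
    """Combine the environment reward with shaping terms for one step.

    events maps hypothetical event names to how often they occurred this step.
    Rewards are added and penalties are subtracted (assumed convention).
    """
    reward = base_reward
    reward += cfg.kill_monster_reward * events.get("kill_monster", 0)
    reward -= cfg.bump_head_penalty * events.get("bump_head", 0)
    reward -= cfg.squat_penalty * events.get("squat", 0)
    reward -= cfg.jump_penalty * events.get("jump", 0)
    reward -= cfg.die_penalty * events.get("die", 0)
    return reward


if __name__ == "__main__":
    cfg = ShapingConfig(bump_head_penalty=0.1, kill_monster_reward=5.0,
                        squat_penalty=0.1, die_penalty=-3.0)
    # Under the assumed convention, a die penalty of -3.0 adds 3.0 on death.
    print(shaped_reward(0.0, {"die": 1}, cfg))
```

Varying these weights across runs presumably steers agents toward different behaviours (more jumping, fewer squats, occasional deaths), which would diversify the collected gameplay.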

Data Collection

Collect video data with a trained agent. The following command creates a folder {save_dir}/{model_name}_seed_{seed}, which contains the audio semantic maps used to reconstruct game audio and the csv containing all game metadata. The csv is used to reconstruct video data in the next step.

python -m coinrun.collect_data --collect_data --paint-vel-info 1 \
                --set-seed 406 --restore-id gamev2_fine_tune_squat_penalty_timeout_300 \
                --save-dir  \
                --level-timeout 600 --num-levels-to-collect 2000
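
The folder naming pattern described above can be reproduced with a small Python helper, which is handy when a later script needs to locate the collected data. The helper name collection_folder and the save directory "collected_data" are hypothetical. The "model_" prefix on the model name is inferred from the gen_videos.sh example in this README rather than confirmed.

```python
import os


def collection_folder(save_dir, model_name, seed):
    """Build {save_dir}/{model_name}_seed_{seed} as described in the text."""
    return os.path.join(save_dir, f"{model_name}_seed_{seed}")


if __name__ == "__main__":
    # "collected_data" is a placeholder; substitute the value passed to --save-dir.
    folder = collection_folder(
        "collected_data",
        "model_gamev2_fine_tune_squat_penalty_timeout_300",  # assumed "model_" + restore id
        406,
    )
    print(os.path.basename(folder))
```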

The next step is to create 3.2 second videos with audio by running the script gen_videos.sh. This script first parses the csv metadata of agent gameplay into a json format. It then samples 3.2 second clips, renders them to RGB, generates audio, and saves .mp4 files. Note that the sampling logic in gen_videos.py only generates videos for levels of sufficient length that contain interesting game events. You can adjust this sampling logic in gen_videos.py to your liking.
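
As a rough illustration of the kind of filtering described above, the Python sketch below samples fixed-length clip windows from one level, skipping levels that are too short and windows without any game events. It is not the logic in gen_videos.py: the frame rate of 30 fps, the per-frame event-list input format, and all names and thresholds are assumptions made for this example.

```python
import random

FPS = 30  # assumed frame rate; not stated in this README
CLIP_SECONDS = 3.2
CLIP_FRAMES = round(FPS * CLIP_SECONDS)  # 96 frames under the assumed FPS


def sample_clips(frame_events, num_clips, rng,
                 min_level_frames=CLIP_FRAMES, min_events=1):
    """Sample up to num_clips (start, end) frame windows from one level.

    frame_events[i] is a list of event names at frame i (hypothetical format).
    Levels shorter than min_level_frames yield no clips, and a window is only
    eligible if it contains at least min_events events. Windows may overlap.
    """
    n = len(frame_events)
    if n < max(min_level_frames, CLIP_FRAMES):
        return []
    eligible = [
        start for start in range(n - CLIP_FRAMES + 1)
        if sum(len(ev) for ev in frame_events[start:start + CLIP_FRAMES]) >= min_events
    ]
    rng.shuffle(eligible)
    return sorted((start, start + CLIP_FRAMES) for start in eligible[:num_clips])


if __name__ == "__main__":
    level = [[] for _ in range(200)]
    level[150].append("collect_coin")  # hypothetical event name
    for start, end in sample_clips(level, 3, random.Random(0)):
        print(start, end)
```

Raising min_level_frames or min_events in a sketch like this would trade dataset size for clips with more going on, which is the kind of adjustment the paragraph above refers to.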

There are three outputs from this script:

  1. ./json_metadata - where full level jsons are saved for longer video rendering
  2. ./video_metadata - where 3.2 second video jsons are saved
  3. ./videos - where 3.2s .mp4 videos with audio are saved. We use these videos for human annotation.

bash gen_videos.sh

For example:

bash gen_videos.sh video_data model_gamev2_fine_tune_squat_penalty_timeout_300_seed_406

License Info

The majority of MUGEN is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: CoinRun, VideoGPT, VideoCLIP, and S3D are licensed under the MIT license; Tokenizer is licensed under the Apache 2.0 license; Pycocoevalcap is licensed under the BSD license; VGGSound is licensed under the CC-BY-4.0 license.
