A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

Overview

MUGEN Dataset

Project Page | Paper

Setup

conda create --name MUGEN python=3.6
conda activate MUGEN
pip install --ignore-installed https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.12.0-cp36-cp36m-linux_x86_64.whl 
module load cuda/9.0
module load cudnn/v7.4-cuda.10.0
git clone coinrun_MUGEN
cd coinrun_MUGEN
pip install -r requirements.txt
conda install -c conda-forge mpi4py
pip install -e .

Training Agents

Basic training commands:

python -m coinrun.train_agent --run-id myrun --save-interval 1

After each parameter update, this will save a copy of the agent to ./saved_models/. Results are logged to /tmp/tensorflow by default.

Run parallel training using MPI:

mpiexec -np 8 python -m coinrun.train_agent --run-id myrun

Train an agent on a fixed set of N levels. With N = 0, the training set is unbounded.

python -m coinrun.train_agent --run-id myrun --num-levels N
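
As a rough illustration of what --num-levels means, the sketch below draws a level seed either from a fixed pool of N seeds or, when N is 0, from an effectively unbounded range. The function name sample_level_seed and the MAX_SEED bound are assumptions made for illustration; the actual level selection happens inside the game code and may differ.

```python
import random

MAX_SEED = 2**31 - 1  # assumed upper bound, for illustration only


def sample_level_seed(num_levels, rng):
    """Return a level seed; num_levels == 0 means an unbounded training set."""
    if num_levels < 0:
        raise ValueError("num_levels must be non-negative")
    if num_levels == 0:
        # Unbounded case: any seed in the full range may be drawn.
        return rng.randint(0, MAX_SEED)
    # Fixed case: only seeds 0 .. num_levels - 1 are ever used.
    return rng.randrange(num_levels)


rng = random.Random(0)
fixed = {sample_level_seed(5, rng) for _ in range(1000)}
```

With a fixed N, the agent only ever sees the same N levels, which is useful for measuring generalization to unseen levels.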

Continue training an agent from a checkpoint:

python -m coinrun.train_agent --run-id newrun --restore-id myrun

View training options:

python -m coinrun.train_agent --help

Example commands for MUGEN agents:

Base model

python -m coinrun.train_agent --run-id name_your_agent \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 80 \
                --bump-head-penalty 0.25 --kill-monster-reward 10.0

Add squat penalty to reduce excessive squatting

python -m coinrun.train_agent --run-id gamev2_fine_tune_m4_squat_penalty \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 811 \
                --bump-head-penalty 0.1 --kill-monster-reward 5.0 --squat-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_0

Larger model

python -m coinrun.train_agent --run-id gamev2_largearch_bump_head_penalty_0.05_0 \
                --architecture impalalarge --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 51 \
                --bump-head-penalty 0.05 --kill-monster-reward 10.0

Add reward for dying (set via a negative --die-penalty)

python -m coinrun.train_agent --run-id gamev2_fine_tune_squat_penalty_die_reward_3.0 \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 857 \
                --bump-head-penalty 0.1 --kill-monster-reward 5.0 --squat-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_squat_penalty --die-penalty -3.0

Add jump penalty

python -m coinrun.train_agent --run-id gamev2_fine_tune_m4_jump_penalty \
                --architecture impala --paint-vel-info 1 --dropout 0.0 --l2-weight 0.0001 \
                --num-levels 0 --use-lstm 1 --num-envs 96 --set-seed 811 \
                --bump-head-penalty 0.1 --kill-monster-reward 10.0 --jump-penalty 0.1 \
                --restore-id gamev2_fine_tune_m4_0
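
To make the effect of the shaping flags above concrete, here is a minimal sketch of how such terms could combine into a per-step reward. The field names mirror the command-line flags, but the ShapingConfig class, the shaped_reward function, the event names, and the additive combination rule are all assumptions for illustration; the environment's real reward computation may differ.

```python
from dataclasses import dataclass


@dataclass
class ShapingConfig:
    # Field names mirror the CLI flags; how they combine is an assumption.
    bump_head_penalty: float = 0.0
    kill_monster_reward: float = 0.0
    squat_penalty: float = 0.0
    jump_penalty: float = 0.0
    die_penalty: float = 0.0


def shaped_reward(base_reward, events, cfg):
    """Add bonuses and subtract penalties for the events seen in one step.

    `events` maps a hypothetical event name to how often it occurred.
    """
    reward = base_reward
    reward += cfg.kill_monster_reward * events.get("kill_monster", 0)
    reward -= cfg.bump_head_penalty * events.get("bump_head", 0)
    reward -= cfg.squat_penalty * events.get("squat", 0)
    reward -= cfg.jump_penalty * events.get("jump", 0)
    # A negative die penalty acts as a reward for dying.
    reward -= cfg.die_penalty * events.get("die", 0)
    return reward


cfg = ShapingConfig(bump_head_penalty=0.1, kill_monster_reward=5.0,
                    squat_penalty=0.1, die_penalty=-3.0)
die_step = shaped_reward(0.0, {"die": 1}, cfg)
kill_step = shaped_reward(0.0, {"kill_monster": 1, "squat": 2}, cfg)
```

Under this assumed rule, passing --die-penalty -3.0 turns the penalty into a bonus, which matches the "reward for dying" example above.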

Data Collection

Collect video data with a trained agent. The following command creates a folder {save_dir}/{model_name}_seed_{seed}, which contains the audio semantic maps needed to reconstruct game audio and the csv containing all game metadata. The csv is used to reconstruct video data in the next step.

python -m coinrun.collect_data --collect_data --paint-vel-info 1 \
                --set-seed 406 --restore-id gamev2_fine_tune_squat_penalty_timeout_300 \
                --save-dir  \
                --level-timeout 600 --num-levels-to-collect 2000
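
As a hedged sketch of how the collected csv might be consumed, the snippet below groups per-frame rows by level and serializes them to JSON. The column names (level_seed, frame, agent_x, agent_y, event) and the sample rows are hypothetical; the real csv written by coinrun.collect_data may use a different schema.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical columns and rows, for illustration only.
SAMPLE_CSV = """level_seed,frame,agent_x,agent_y,event
7,0,1.0,2.0,
7,1,1.5,2.0,jump
9,0,0.0,1.0,
"""


def csv_to_level_json(csv_text):
    """Group per-frame csv rows by level and return a JSON string."""
    levels = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        levels[row["level_seed"]].append(
            {
                "frame": int(row["frame"]),
                "agent_x": float(row["agent_x"]),
                "agent_y": float(row["agent_y"]),
                # Empty event cells become None (null in JSON).
                "event": row["event"] or None,
            }
        )
    return json.dumps(levels, indent=2)


parsed = json.loads(csv_to_level_json(SAMPLE_CSV))
```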

The next step is to create 3.2-second videos with audio by running the script gen_videos.sh. This script first parses the csv metadata of agent gameplay into json format. It then samples 3.2-second clips, renders them to RGB, generates audio, and saves .mp4 files. Note that the sampling logic in gen_videos.py only generates videos for levels of sufficient length that contain interesting game events. You can adjust this sampling logic in gen_videos.py to your liking.

There are three outputs from this script:

  1. ./json_metadata - where full level jsons are saved for longer video rendering
  2. ./video_metadata - where 3.2 second video jsons are saved
  3. ./videos - where 3.2s .mp4 videos with audio are saved. We use these videos for human annotation.
bash gen_videos.sh  

For example:

bash gen_videos.sh video_data model_gamev2_fine_tune_squat_penalty_timeout_300_seed_406
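
The clip sampling described above (skip levels that are too short, keep only windows with interesting events) can be sketched as follows. This is not the logic in gen_videos.py; the frame rate, the sample_clip_starts function, and the min_events threshold are assumptions chosen for illustration.

```python
import random

FPS = 30  # assumed frame rate
CLIP_SECONDS = 3.2
CLIP_FRAMES = round(FPS * CLIP_SECONDS)  # 96 frames under the assumed FPS


def sample_clip_starts(num_frames, event_frames, num_clips, rng, min_events=1):
    """Pick start frames of fixed-length clips that contain enough events."""
    if num_frames < CLIP_FRAMES:
        return []  # level too short to hold a single clip
    candidates = []
    for start in range(num_frames - CLIP_FRAMES + 1):
        end = start + CLIP_FRAMES
        # Count events falling inside the half-open window [start, end).
        hits = sum(1 for f in event_frames if start <= f < end)
        if hits >= min_events:
            candidates.append(start)
    rng.shuffle(candidates)
    return sorted(candidates[:num_clips])


starts = sample_clip_starts(300, [100, 250], 3, random.Random(0))
too_short = sample_clip_starts(50, [10], 3, random.Random(0))
```

Raising min_events or filtering event_frames by event type would be one way to bias the sample toward more eventful clips.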

License Info

The majority of MUGEN is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: CoinRun, VideoGPT, VideoCLIP, and S3D are licensed under the MIT license; Tokenizer is licensed under the Apache 2.0 license; Pycocoevalcap is licensed under the BSD license; VGGSound is licensed under the CC-BY-4.0 license.
