A look-ahead multi-entity Transformer for modeling coordinated agents.

baller2vec++

This is the repository for the paper:

Michael A. Alcorn and Anh Nguyen. baller2vec++: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents. arXiv. 2021.

To learn statistically dependent agent trajectories, baller2vec++ uses a specially designed self-attention mask to simultaneously process three different sets of feature vectors in a single Transformer. The three sets consist of location feature vectors like those found in baller2vec, look-ahead trajectory feature vectors, and starting location feature vectors. This design allows the model to integrate information about concurrent agent trajectories through multiple Transformer layers without seeing the future (in contrast to baller2vec).

[Figure panels: Training sample | baller2vec | baller2vec++]

When trained on a dataset of perfectly coordinated agent trajectories, the trajectories generated by baller2vec are completely uncoordinated while the trajectories generated by baller2vec++ are perfectly coordinated.

[Figure panels, row 1: Ground truth | baller2vec | baller2vec | baller2vec]
[Figure panels, row 2: Ground truth | baller2vec++ | baller2vec++ | baller2vec++]

While baller2vec occasionally generates realistic trajectories for the red defender, it also makes egregious errors. In contrast, the trajectories generated by baller2vec++ often seem plausible. The red player was placed last in the player order when generating his trajectory with baller2vec++.

Citation

If you use this code for your own research, please cite:

@article{alcorn2021baller2vec,
   title={\texttt{baller2vec++}: A Look-Ahead Multi-Entity Transformer For Modeling Coordinated Agents},
   author={Alcorn, Michael A. and Nguyen, Anh},
   journal={arXiv preprint arXiv:2104.11980},
   year={2021}
}

Training baller2vec++

Setting up .basketball_profile

After you've cloned the repository to your desired location, create a file called .basketball_profile in your home directory:

nano ~/.basketball_profile

and copy and paste in the contents of the repository's .basketball_profile, replacing each of the variable values with paths relevant to your environment. Next, add the following line to the end of your ~/.bashrc:

source ~/.basketball_profile

and either log out and log back in again or run:

source ~/.bashrc

You should now be able to copy and paste all of the commands in the various instructions sections. For example:

echo ${PROJECT_DIR}

should print the path you set for PROJECT_DIR in .basketball_profile.
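
To check several variables at once, a small helper along these lines may be useful. This is only a sketch: `check_profile_vars` is a hypothetical function name, not part of the repository, and the four variables listed are just the ones used in the instructions below, so your .basketball_profile may define others.

```shell
# Hypothetical helper (not part of the repository): report which of the named
# environment variables are unset or empty. Returns 0 if all are set, 1 otherwise.
check_profile_vars() {
    _missing=0
    for _name in "$@"; do
        # Look up the value of the variable whose name is stored in _name.
        eval "_value=\${$_name:-}"
        if [ -z "$_value" ]; then
            echo "$_name is not set"
            _missing=1
        fi
    done
    return "$_missing"
}

# Variables referenced in the instructions of this README.
check_profile_vars PROJECT_DIR DATA_DIR GAMES_DIR EXPERIMENTS_DIR \
    && echo "All profile variables are set."
```

If a variable is reported as not set, re-check ~/.basketball_profile and run `source ~/.bashrc` again.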

Installing the necessary Python packages

cd ${PROJECT_DIR}
pip3 install --upgrade -r requirements.txt

Organizing the play-by-play and tracking data

  1. Copy events.zip (which I acquired from here [mirror here] using https://downgit.github.io) to the DATA_DIR directory and unzip it:
mkdir -p ${DATA_DIR}
cp ${PROJECT_DIR}/events.zip ${DATA_DIR}
cd ${DATA_DIR}
unzip -q events.zip
rm events.zip

Descriptions for the various EVENTMSGTYPEs can be found here (mirror here).

  2. Clone the tracking data from here (mirror here) to the DATA_DIR directory:
cd ${DATA_DIR}
git clone git@github.com:linouk23/NBA-Player-Movements.git

A description of the tracking data can be found here.

Generating the training data

cd ${PROJECT_DIR}
nohup python3 generate_game_numpy_arrays.py > data.log &

You can monitor its progress with:

top

or:

ls -U ${GAMES_DIR} | wc -l

There should be 1,262 NumPy arrays (corresponding to 631 X/y pairs) when finished.
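
The count check can also be wrapped in a small function, sketched below. The names `count_game_arrays` and `report_array_progress` are hypothetical and not part of the repository; the sketch simply counts every entry in the directory, exactly as the `ls -U ... | wc -l` command above does, and compares the result against an expected total that you pass in.

```shell
# Hypothetical helpers (not part of the repository).

# Count the entries in the directory given as $1.
count_game_arrays() {
    ls -U "$1" | wc -l | tr -d ' '
}

# Compare the number of entries in directory $1 with the expected total $2.
report_array_progress() {
    _found=$(count_game_arrays "$1")
    if [ "$_found" -eq "$2" ]; then
        echo "Found all $2 arrays."
    else
        echo "Found $_found of $2 arrays so far."
    fi
}

# Example call, assuming GAMES_DIR is set in .basketball_profile:
# report_array_progress "${GAMES_DIR}" 1262
```

Because the function counts every entry, any unrelated files placed in the directory would be counted as well.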

Running the training script

Run (or copy and paste) the following script, editing the variables as appropriate.

#!/usr/bin/env bash

JOB=$(date +%Y%m%d%H%M%S)

echo "train:" >> ${JOB}.yaml
task=basketball  # "basketball" or "toy".
echo "  task: ${task}" >> ${JOB}.yaml
if [[ "$task" = "basketball" ]]
then

    echo "  train_valid_prop: 0.95" >> ${JOB}.yaml
    echo "  train_prop: 0.95" >> ${JOB}.yaml
    echo "  train_samples_per_epoch: 20000" >> ${JOB}.yaml
    echo "  valid_samples: 1000" >> ${JOB}.yaml
    echo "  workers: 10" >> ${JOB}.yaml
    echo "  learning_rate: 1.0e-5" >> ${JOB}.yaml
    echo "  patience: 20" >> ${JOB}.yaml

    echo "dataset:" >> ${JOB}.yaml
    echo "  hz: 5" >> ${JOB}.yaml
    echo "  secs: 4.2" >> ${JOB}.yaml
    echo "  player_traj_n: 11" >> ${JOB}.yaml
    echo "  max_player_move: 4.5" >> ${JOB}.yaml

    echo "model:" >> ${JOB}.yaml
    echo "  embedding_dim: 20" >> ${JOB}.yaml
    echo "  sigmoid: none" >> ${JOB}.yaml
    echo "  mlp_layers: [128, 256, 512]" >> ${JOB}.yaml
    echo "  nhead: 8" >> ${JOB}.yaml
    echo "  dim_feedforward: 2048" >> ${JOB}.yaml
    echo "  num_layers: 6" >> ${JOB}.yaml
    echo "  dropout: 0.0" >> ${JOB}.yaml
    echo "  b2v: False" >> ${JOB}.yaml

else

    echo "  workers: 10" >> ${JOB}.yaml
    echo "  learning_rate: 1.0e-4" >> ${JOB}.yaml

    echo "model:" >> ${JOB}.yaml
    echo "  embedding_dim: 20" >> ${JOB}.yaml
    echo "  sigmoid: none" >> ${JOB}.yaml
    echo "  mlp_layers: [64, 128]" >> ${JOB}.yaml
    echo "  nhead: 4" >> ${JOB}.yaml
    echo "  dim_feedforward: 512" >> ${JOB}.yaml
    echo "  num_layers: 2" >> ${JOB}.yaml
    echo "  dropout: 0.0" >> ${JOB}.yaml
    echo "  b2v: True" >> ${JOB}.yaml

fi

# Save experiment settings.
mkdir -p ${EXPERIMENTS_DIR}/${JOB}
mv ${JOB}.yaml ${EXPERIMENTS_DIR}/${JOB}/

gpu=0
cd ${PROJECT_DIR}
nohup python3 train_baller2vecplusplus.py ${JOB} ${gpu} > ${EXPERIMENTS_DIR}/${JOB}/train.log &
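
If you prefer not to build the settings file one echo at a time, the same YAML could be written with a here-document. The sketch below covers only the "toy" branch and copies its values from the script above; `write_toy_config` is a hypothetical function name, not part of the repository.

```shell
# Hypothetical helper (not part of the repository): write the "toy" task
# settings from the script above to the file given as $1.
write_toy_config() {
    # The quoted 'EOF' delimiter stops the shell from expanding anything in the body.
    cat > "$1" <<'EOF'
train:
  task: toy
  workers: 10
  learning_rate: 1.0e-4
model:
  embedding_dim: 20
  sigmoid: none
  mlp_layers: [64, 128]
  nhead: 4
  dim_feedforward: 512
  num_layers: 2
  dropout: 0.0
  b2v: True
EOF
}

# Example call, assuming JOB is set as in the script above:
# write_toy_config "${JOB}.yaml"
```

Note that `>` overwrites an existing file, whereas the echo lines in the script append with `>>`.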