
Multi-Objective Reinforced Active Learning

Dependencies

  • wandb
  • tqdm
  • pytorch >= 1.7.0
  • numpy >= 1.20.0
  • scipy >= 1.1.0
  • pycolab == 1.2
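
These can typically be installed via pip (for PyTorch, following the official installation instructions for your CUDA setup may be preferable), for example:

pip install wandb tqdm numpy scipy pycolab==1.2 torch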

Weights and Biases

Our code depends on Weights and Biases for logging and visualizing results during training. The training scripts therefore call wandb.init(), which prompts for an API key to link the training runs to your personal wandb account. When running the code for the first time, paste your WANDB_API_KEY into this prompt.
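
If you prefer to authenticate ahead of time instead of waiting for the interactive prompt, wandb also supports non-interactive login. A minimal sketch, assuming your key is exported as WANDB_API_KEY in the shell:

import os
import wandb

# Non-interactive authentication; assumes WANDB_API_KEY is set in the environment.
wandb.login(key=os.environ["WANDB_API_KEY"])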

Environments

Our gridworlds (Emergency: randomized_v2.py, Delivery: randomized_v3.py) build on the Pycolab game engine with a custom wrapper that provides functionality similar to gym environments. The engine comes with a console user interface, and any environment can be played interactively by running its file directly (e.g. python randomized_v2.py), with the arrow keys and w, a, s, d as controls.
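
For programmatic use, the wrapper exposes a gym-style interface. The rollout sketch below is only illustrative; the class name GymWrapper and the import path are assumptions, so consult the environment files for the actual entry points.

import numpy as np
from randomized_v2 import GymWrapper  # hypothetical name; see the environment files for the real wrapper class

# Illustrative gym-style rollout with random actions.
env = GymWrapper()
obs = env.reset()
done = False
while not done:
    action = np.random.randint(env.action_space.n)  # assumes a discrete action space
    obs, reward, done, info = env.step(action)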

Training

There are four training scripts for

  • manually training a PPO agent on custom rewards (ppo_train.py),
  • training AIRL on a single expert dataset (airl_train.py),
  • active MORL with custom/automatic preferences (moral_train.py) and
  • training DRLHP with custom/automatic preferences (drlhp_train.py).

When using automatic preferences, a desired ratio can be passed as an argument. For example,

python moral_train.py --ratio a b c

will run MORAL using a (real-valued) ratio of a:b:c among the three explicit objectives in Delivery.
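
For instance, an illustrative (not paper-specific) ratio of 1:3:1 would be passed as

python moral_train.py --ratio 1 3 1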

Hyperparameters

Hyperparameters are passed as arguments to wandb.init() and can be changed by modifying the respective training files.
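
For reference, the pattern looks roughly as follows; the project name and hyperparameter names/values here are illustrative placeholders, not the settings used in the training files.

import wandb

# Illustrative only: the real hyperparameters are defined inside the training scripts.
config = dict(env_id='randomized_v3', lr_ppo=3e-4, batchsize_ppo=12, gamma=0.999)
wandb.init(project='MORAL', config=config)
print(wandb.config.lr_ppo)  # values are read back from wandb.config during training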
