Lux AI environment interface for RLlib multi-agents

Last update: Nov 07, 2022

Related tags

Deep Learning LuxAI-RLlib

Overview

Lux AI interface to RLlib `MultiAgentsEnv`

For Lux AI Season 1 Kaggle competition.

Please let me know if you use this, I'd like to see what people build with it!

TL;DR

The only thing you need to customise is the interface class (inheriting from multilux.lux_interface.LuxDefaultInterface). The interface needs to:

Implement four "toward-agent" methods:
- observation(joint_observation, actors)
- reward(joint_reward, actors)
- done(joint_done, actors)
- info(joint_info, actors)
Implement one "toward-environment" method:
- actions(action_dict)
Manage its own actor id creation, assignment, etc. (hint citytiles don't have ids in the game engine)

Implementation diagram

Example for training

import numpy as np

# (1) Define your custom interface for (obs, reward, done, info, actions) ---
from multilux.lux_interface import LuxDefaultInterface

class MyInterface(LuxDefaultInterface):
    def observation(self, joint_obs, actors) -> dict:
        return {a: np.array([0, 0]) for a in actors}

    def reward(self, joint_reward, actors) -> dict:
        return {a: 0 for a in actors}

    def done(self, joint_done, actors) -> dict:
        return {a: True for a in actors}

    def info(self, joint_info, actors) -> dict:
        return {a: {} for a in actors}

    def actions(self, action_dict) -> list:
        return []
    
# (2) Register environment --------------------------------------------------
from ray.tune.registry import register_env
from multilux.lux_env import LuxEnv


def env_creator(env_config):
    
    configuration = env_config.get(configuration, {})
    debug = env_config.get(debug, False)
    interface = env_config.get(interface, MyInterface)
    agents = env_config.get(agents, (None, "simple_agent"))
    
    return LuxEnv(configuration, debug,
                     interface=interface,
                     agents=agents,
                     train=True)

register_env("multilux", env_creator)

# (3) Define observation and action spaces for each actor type --------------
from gym import spaces

u_obs_space = spaces.Box(low=0, high=1, shape=(2,), dtype=np.float16)
u_act_space = spaces.Discrete(2)
ct_obs_space = spaces.Box(low=0, high=1, shape=(2,), dtype=np.float16)
ct_act_space = spaces.Discrete(2)

# (4) Instantiate agent ------------------------------------------------------
import random
from ray.rllib.agents import ppo

config = {
    "env_config": {},
    "multiagent": {
        "policies": {
            # the first tuple value is None -> uses default policy
            "unit-1": (None, u_obs_space, u_act_space, {"gamma": 0.85}),
            "unit-2": (None, u_obs_space, u_act_space, {"gamma": 0.99}),
            "citytile": (None, ct_obs_space, ct_act_space, {}),
        },
        "policy_mapping_fn":
            lambda agent_id:
                "citytile"  # Citytiles always have the same policy
                if agent_id.startswith("u_")
                else random.choice(["unit-1", "unit-2"])  # Randomly choose from unit policies
    },
}

trainer = ppo.PPOTrainer(env=LuxEnv, config=config)

# (5) Train away -------------------------------------------------------------
while True:
    print(trainer.train())

See examples/training.py

See also the LuxPythonEnvGym OpenAI-gym port by @glmcdona.

Jaime Ruiz Serra

Lux AI environment interface for RLlib multi-agents

Related tags

Overview

Lux AI interface to RLlib `MultiAgentsEnv`

TL;DR

Implementation diagram

Example for training

Owner

Jaime

[NeurIPS 2021] Low-Rank Subspaces in GANs

Unofficial PyTorch implementation of SimCLR by Google Brain

A Data Annotation Tool for Semantic Segmentation, Object Detection and Lane Line Detection.(In Development Stage)

📚 Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

Expressive Power of Invariant and Equivaraint Graph Neural Networks (ICLR 2021)

Relaxed-machines - explorations in neuro-symbolic differentiable interpreters

Meta-Learning Sparse Implicit Neural Representations (NeurIPS 2021)

Combinatorially Hard Games where the levels are procedurally generated

FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

MMRazor: a model compression toolkit for model slimming and AutoML

FSL-Mate: A collection of resources for few-shot learning (FSL).

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

A Haskell kernel for IPython.

一套完整的微博舆情分析流程代码，包括微博爬虫、LDA主题分析和情感分析。

Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.

Scripts and outputs related to the paper Prediction of Adverse Biological Effects of Chemicals Using Knowledge Graph Embeddings.

Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"

Materials for upcoming beginner-friendly PyTorch course (work in progress).

Lux AI environment interface for RLlib multi-agents

Related tags

Overview

Lux AI interface to RLlib MultiAgentsEnv

TL;DR

Implementation diagram

Example for training

Owner

Jaime

[NeurIPS 2021] Low-Rank Subspaces in GANs

Unofficial PyTorch implementation of SimCLR by Google Brain

A Data Annotation Tool for Semantic Segmentation, Object Detection and Lane Line Detection.(In Development Stage)

📚 Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.

Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

Official PyTorch implementation of paper: Standardized Max Logits: A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban-Scene Segmentation (ICCV 2021 Oral Presentation)

Expressive Power of Invariant and Equivaraint Graph Neural Networks (ICLR 2021)

Relaxed-machines - explorations in neuro-symbolic differentiable interpreters

Meta-Learning Sparse Implicit Neural Representations (NeurIPS 2021)

Combinatorially Hard Games where the levels are procedurally generated

FIRA: Fine-Grained Graph-Based Code Change Representation for Automated Commit Message Generation

MMRazor: a model compression toolkit for model slimming and AutoML

FSL-Mate: A collection of resources for few-shot learning (FSL).

Bachelor's Thesis in Computer Science: Privacy-Preserving Federated Learning Applied to Decentralized Data

A Haskell kernel for IPython.

一套完整的微博舆情分析流程代码，包括微博爬虫、LDA主题分析和情感分析。

Organseg dags - The repository contains the codebase for multi-organ segmentation with directed acyclic graphs (DAGs) in CT.

Scripts and outputs related to the paper Prediction of Adverse Biological Effects of Chemicals Using Knowledge Graph Embeddings.

Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"

Materials for upcoming beginner-friendly PyTorch course (work in progress).

Lux AI interface to RLlib `MultiAgentsEnv`