SelfDrive_AI

Reinforcement learning for self-driving in a 3D simulation (created using Unity 3D)

1. Requirements for the SelfDrive_AI Gym

You need Python 3.6 or later to run the simulation. Note that the environment is currently supported only on Windows. You can also interact with the simulation directly by launching the exe file and driving with the W, A, S and D keys.

Please follow the two links below to install Unity-Gym and Stable-Baselines3. You can also train the agent with your own reinforcement learning algorithms by following the OpenAI Gym structure (https://gym.openai.com/).

Install Unity-Gym

Install Stable-Baselines3

mlagents can be installed using pip:

$ python3 -m pip install mlagents
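For a custom algorithm, the Gym structure mentioned above comes down to calling reset() once and then step(action) until the episode ends. The sketch below illustrates that loop under a few assumptions: `ToyDrivingEnv` and `run_episode` are hypothetical names that are not part of this project, the toy environment only stands in for the Unity simulation, and step() is assumed to return the classic 4-tuple (observation, reward, done, info).

```python
import random


class ToyDrivingEnv:
    """Hypothetical stand-in for the Unity environment (not part of this project).

    It follows the classic Gym interface: reset() returns an observation and
    step(action) returns (observation, reward, done, info).
    """

    def __init__(self, track_length=10):
        self.track_length = track_length
        self.position = 0

    def reset(self):
        self.position = 0
        return [float(self.position)]

    def step(self, action):
        # Action 1 moves the car forward; any other action keeps it in place.
        if action == 1:
            self.position += 1
        done = self.position >= self.track_length
        reward = 1.0 if done else -0.01
        return [float(self.position)], reward, done, {}


def run_episode(env, policy, max_steps=1000):
    """Run one episode and return (total_reward, steps_taken)."""
    obs = env.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        action = policy(obs)
        obs, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            return total_reward, step
    return total_reward, max_steps


if __name__ == "__main__":
    rng = random.Random(0)

    def random_policy(obs):
        # Baseline policy: ignore the observation and pick an action at random.
        return rng.choice([0, 1])

    print(run_episode(ToyDrivingEnv(), random_policy))
```

Any object that exposes the same reset()/step() pair could be passed to `run_episode` in place of the toy environment, and `policy` can be replaced by the action-selection function of your own algorithm.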

The image below illustrates the goal of the AI car: it must explore the available trajectories in order to find the bridge.

2. (Training) You can train the agent with the code below, which follows the OpenAI Gym structure. It saves the training results to a log directory that you can view with TensorBoard. Feel free to change the parameters in the code.

import os
import time

from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.policies import ActorCriticPolicy
from stable_baselines3.common.vec_env import DummyVecEnv

# Side channel used to set the simulation speed (time scale)
channel = EngineConfigurationChannel()


env_name = "./UnityEnv"
speed = 15


env = UnityEnvironment(env_name, seed=1, side_channels=[channel])
channel.set_configuration_parameters(time_scale=speed)
env = UnityToGymWrapper(env, uint8_visual=False)  # Gym interface around the Unity environment

time_int = int(time.time())

# Directories for storing results
log_dir = "stable_results/Euler_env_3{}/".format(time_int)
log_dirTF = "stable_results/tensorflow_log_Euler3{}/".format(time_int) 
os.makedirs(log_dir, exist_ok=True)

env = Monitor(env, log_dir, allow_early_resets=True)
env = DummyVecEnv([lambda: env])  # The algorithms require a vectorized environment to run


# device='cuda' requires a GPU; use device='auto' to fall back to the CPU
model = PPO(ActorCriticPolicy, env, verbose=1, tensorboard_log=log_dirTF, device='cuda')
model.learn(total_timesteps=200000)  # total number of training timesteps; adjust as needed
time_int2 = int(time.time())
print('Time taken for training (s):', time_int2 - time_int)
# Save the model
model.save("Env_model")


# Load the saved model for testing
# del model
model = PPO.load("Env_model")

obs = env.reset()

# Test the agent for 400 steps after training

for i in range(400):
    action, states = model.predict(obs)
    obs, rewards, done, info = env.step(action)
    env.render()

To monitor training progress with TensorBoard, run the following command from a terminal, replacing the placeholder with the path to the log directory:

$ tensorboard --logdir "HERE PUT THE PATH TO THE DIRECTORY"
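Besides TensorBoard, the Monitor wrapper used in the training script also records per-episode statistics in its log directory. The sketch below summarizes such a file with the standard library only. It assumes the file is named monitor.csv inside log_dir and starts with a `#` metadata line followed by the columns r (episode reward), l (episode length) and t (elapsed time), which is how Stable-Baselines3 writes it as far as I know; `summarize_monitor` is a hypothetical helper name.

```python
import csv


def summarize_monitor(path):
    """Return (episode_count, mean_reward, mean_length) from a monitor.csv file."""
    rewards, lengths = [], []
    with open(path, newline="") as handle:
        first = handle.readline()
        if not first.startswith("#"):
            handle.seek(0)  # no metadata line: rewind so the CSV header is read
        for row in csv.DictReader(handle):
            rewards.append(float(row["r"]))
            lengths.append(int(row["l"]))
    if not rewards:
        return 0, 0.0, 0.0
    count = len(rewards)
    return count, sum(rewards) / count, sum(lengths) / count
```

For example, `summarize_monitor(os.path.join(log_dir, "monitor.csv"))` could be called after training, provided the file exists at that location.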

3. (Testing) The following code can be used to test the trained agent.

import os
import time

from gym_unity.envs import UnityToGymWrapper
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.side_channel.engine_configuration_channel import EngineConfigurationChannel
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import DummyVecEnv

# Side channel used to set the simulation speed (time scale)
channel = EngineConfigurationChannel()


env_name = "./UnityEnv"
speed = 1


env = UnityEnvironment(env_name, seed=1, side_channels=[channel])
channel.set_configuration_parameters(time_scale=speed)
env = UnityToGymWrapper(env, uint8_visual=False)  # Gym interface around the Unity environment

time_int = int(time.time())

# Directories for storing results
log_dir = "stable_results/Euler_env_3{}/".format(time_int)
log_dirTF = "stable_results/tensorflow_log_Euler3{}/".format(time_int)
os.makedirs(log_dir, exist_ok=True)

env = Monitor(env, log_dir, allow_early_resets=True)
env = DummyVecEnv([lambda: env])  # The algorithms require a vectorized environment to run


model = PPO.load("Env_model")

obs = env.reset()

# Test the agent for 1000 steps after training

for i in range(1000):
    action, states = model.predict(obs)
    obs, rewards, done, info = env.step(action)
    env.render()
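The test loop above steps the agent but does not record how well it performs. A minimal sketch of a loop that also collects the return of each finished episode is given here. It assumes a vectorized environment holding a single environment (such as the DummyVecEnv above), which resets automatically when an episode ends, and a model whose predict() accepts `deterministic=True` as Stable-Baselines3 models do; `rollout_returns` is a hypothetical helper name.

```python
def rollout_returns(model, vec_env, n_steps=1000):
    """Step a single-environment VecEnv and return the completed episode returns."""
    obs = vec_env.reset()
    episode_return = 0.0
    finished = []
    for _ in range(n_steps):
        action, _state = model.predict(obs, deterministic=True)
        obs, rewards, dones, _infos = vec_env.step(action)
        # rewards and dones hold one entry per environment; only index 0 is used
        episode_return += float(rewards[0])
        if dones[0]:
            # The vectorized wrapper has already reset the environment here
            finished.append(episode_return)
            episode_return = 0.0
    return finished
```

With the names from the script, `rollout_returns(model, env, n_steps=1000)` would return a list with one total reward per completed episode; an episode still running when the step budget ends is not included.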

Note: I am still developing the project by introducing more challenging constraints.

Owner

Surajit Saikia
