Gymnasium environments for the Search Race CodinGame optimization puzzle and Mad Pod Racing CodinGame bot programming game.
Demo video: `search_race_v2_demo.mp4`
| | |
| --- | --- |
| Action Space | `Box([-1, 0], [1, 1], float64)` |
| Observation Space | `Box(-1, 1, shape=(8,), float64)` |
| import | `gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v2")` |
To install `gymnasium-search-race` with pip, execute:

```shell
pip install gymnasium_search_race
```
To install from source:

```shell
git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .
```
The action is an `ndarray` with 2 continuous variables:

- The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
- The thrust between 0 and 200, normalized between 0 and 1.
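To make the normalization concrete, here is a small sketch (not the environment's internal code) that maps a normalized action back to physical units, assuming the default `car_max_thrust` of 200:

```python
def denormalize_action(action, max_angle=18.0, max_thrust=200.0):
    """Map a normalized Box action to (angle in degrees, thrust).

    Sketch only: linearly rescales [-1, 1] -> [-18, 18] for the angle
    and [0, 1] -> [0, max_thrust] for the thrust.
    """
    angle = action[0] * max_angle
    thrust = action[1] * max_thrust
    return angle, thrust
```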
The observation is an `ndarray` of 8 continuous variables:

- The x and y coordinates and the angle of the next 2 checkpoints relative to the car.
- The horizontal speed vx and vertical speed vy of the car.

All values are normalized between -1 and 1.
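A checkpoint position "relative to the car" can be understood as the world-frame offset rotated into the car's heading frame. The sketch below illustrates that idea; it is an assumption for intuition, not the environment's exact formula (in particular, the normalization step is omitted):

```python
import math

def relative_position(car_x, car_y, car_angle_deg, cp_x, cp_y):
    """Rotate the world-frame offset to a checkpoint into the car's frame.

    A positive rx means the checkpoint is ahead of the car; ry is the
    lateral offset. Illustration only, not the environment's internals.
    """
    dx, dy = cp_x - car_x, cp_y - car_y
    a = math.radians(car_angle_deg)
    rx = dx * math.cos(a) + dy * math.sin(a)
    ry = -dx * math.sin(a) + dy * math.cos(a)
    return rx, ry
```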
The reward is:

- +1 when a checkpoint is visited.
- 0 otherwise.
The starting state is generated by choosing a random CodinGame test case.
The episode ends if either of the following happens:

- Termination: the car visits all checkpoints before the time is out.
- Truncation: the episode length is greater than 600 steps.
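The two ending conditions above can be sketched as a small helper (an illustration of the rules, not the environment's code):

```python
def episode_status(checkpoints_visited, checkpoints_total, step, max_steps=600):
    """Return (terminated, truncated) following the rules above.

    terminated: all checkpoints visited.
    truncated: step count exceeds max_steps without finishing.
    """
    terminated = checkpoints_visited >= checkpoints_total
    truncated = not terminated and step > max_steps
    return terminated, truncated
```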
- `laps`: number of laps. The default value is `3`.
- `car_max_thrust`: maximum thrust. The default value is `200`.
- `test_id`: test case id to generate the checkpoints (see choices here). The default value is `None`, which selects a test case randomly when the `reset` method is called.
```python
import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRace-v2",
    laps=3,
    car_max_thrust=200,
    test_id=1,
)
```
- v2: Update observation with relative positions and angles
- v1: Add boolean to indicate if the next checkpoint is the last checkpoint in observation
- v0: Initial version
The `SearchRaceDiscrete` environment is similar to the `SearchRace` environment, except that the action space is discrete.
```python
import gymnasium as gym

gym.make(
    "gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v2",
    laps=3,
    car_max_thrust=200,
    test_id=1,
)
```
There are 74 discrete actions, corresponding to the combinations of the 37 rotation angles from -18 to 18 degrees and the 2 thrust values 0 and 200.
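The 74-action count follows from 37 angles × 2 thrusts. A sketch of one possible decoding is below; the pairing is stated in the text, but the exact index ordering here is an assumption:

```python
ANGLES = list(range(-18, 19))  # 37 rotation angles in degrees
THRUSTS = [0, 200]             # 2 thrust values

def decode_action(index):
    """Map a discrete action index in [0, 74) to an (angle, thrust) pair.

    The index ordering (angles first, then thrust) is an assumption
    for illustration, not the environment's documented layout.
    """
    angle = ANGLES[index % len(ANGLES)]
    thrust = THRUSTS[index // len(ANGLES)]
    return angle, thrust
```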
- v2: Update observation with relative positions and angles
- v1: Add all angles in action space
- v0: Initial version
The `MadPodRacing` and `MadPodRacingDiscrete` environments can be used to train a runner for the Mad Pod Racing CodinGame bot programming game. They are similar to the `SearchRace` and `SearchRaceDiscrete` environments, with the following differences:

- The maps are generated the same way CodinGame generates them.
- The car position is rounded rather than truncated.
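The rounding-versus-truncation difference matters because the two can disagree by a full unit on the same coordinate:

```python
# Truncation drops the fractional part; rounding goes to the nearest integer.
x = 3499.7
truncated = int(x)    # truncation toward zero
rounded = round(x)    # rounding to nearest integer
```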
```python
import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v1")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v1")
```
Demo video: `mad_pod_racing_v1_demo.mp4`
The `MadPodRacingBlocker` and `MadPodRacingBlockerDiscrete` environments can be used to train a blocker for the Mad Pod Racing CodinGame bot programming game.
```python
import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v1")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v1")
```
Demo video: `mad_pod_racing_blocker_v1_demo.mp4`
- v1: Update observation with relative positions and angles and update maximum thrust
- v0: Initial version
You can use RL Baselines3 Zoo to train and evaluate agents:
```shell
pip install rl_zoo3
```
The hyperparameters are defined in `hyperparams/ppo.yml`.
To train a PPO agent for the Search Race game, execute:
```shell
python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v2 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 10 \
  --gym-packages gymnasium_search_race \
  --env-kwargs "laps:1000" \
  --conf-file hyperparams/ppo.yml \
  --progress
```
For the Mad Pod Racing game, you can add an opponent with the `opponent_path` argument:
```shell
python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/MadPodRacingBlockerDiscrete-v1 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 10 \
  --gym-packages gymnasium_search_race \
  --env-kwargs "opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v1_1/best_model.zip'" "laps:1000" \
  --conf-file hyperparams/ppo.yml \
  --progress
```
To see a trained agent in action on random test cases, execute:
```shell
python -m rl_zoo3.enjoy \
  --algo ppo \
  --env gymnasium_search_race/SearchRaceDiscrete-v2 \
  --n-timesteps 1000 \
  --deterministic \
  --gym-packages gymnasium_search_race \
  --load-best \
  --progress
```
To run test cases with a trained agent, execute:
```shell
python -m scripts.run_test_cases \
  --path rl-trained-agents/ppo/gymnasium_search_race-SearchRaceDiscrete-v2_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v2 \
  --record-video \
  --record-metrics
```
To record a video of a trained agent on Mad Pod Racing, execute:
```shell
python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v1_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v1
```
For Mad Pod Racing Blocker, execute:
```shell
python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlockerDiscrete-v1_1/best_model.zip \
  --opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v1_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v1
```
To run tests, execute:
```shell
pytest
```
To cite the repository in publications:
```bibtex
@misc{gymnasium-search-race,
  author = {Quentin Deschamps},
  title = {Gymnasium Search Race},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}
```
- Gymnasium
- RL Baselines3 Zoo
- Stable Baselines3
- CGSearchRace
- CSB-Runner-Arena
- Coders Strikes Back by Magus