This readme describes how to run several control optimization and co-design experiments and visualize the results.
To set the parameters of group ppo, you can edit run_group_ppo.py and change the following:
experiment_name = all experiment files are saved to saved_data/experiment_name
Create SimJobs to specify which robots to train and in which environments to train them. SimJobs are parameterized by the following:
name = all job files are saved to saved_data/experiment_name/name
robots = array of robot names specifying which robots to train. Robot files must be of type .json (created using the EvoGym Design Tool) or .npz (saved from another experiment) and must be located in examples/world_data
envs = array of environment names in which to train robots
train_iters = number of iterations of ppo to train each robot's controller
Each robot in robots will be trained in each environment in envs for train_iters iterations of ppo.
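The pairing rule above (every robot is trained in every environment) amounts to a cross product. The following is a minimal sketch of that rule only: JobSpec and expand_job are hypothetical stand-ins written for illustration, not the repository's SimJob class, and the experiment, job, robot, and second environment names are placeholders.

```python
from dataclasses import dataclass
from itertools import product
from pathlib import Path


@dataclass
class JobSpec:
    """Hypothetical stand-in that mirrors the SimJob fields described above."""
    name: str
    robots: list
    envs: list
    train_iters: int


def expand_job(experiment_name, job):
    """List every (robot, env) training run implied by one job."""
    # Job files are saved to saved_data/experiment_name/name.
    job_dir = Path("saved_data") / experiment_name / job.name
    runs = []
    for robot, env in product(job.robots, job.envs):
        runs.append({
            "robot": robot,
            "env": env,
            "train_iters": job.train_iters,
            "job_dir": job_dir,
        })
    return runs


# Placeholder names, chosen only for illustration.
job = JobSpec(
    name="big",
    robots=["robot_a", "robot_b"],
    envs=["Walker-v0", "Carrier-v0"],
    train_iters=50,
)
runs = expand_job("test_group_ppo", job)
for run in runs:
    print(run["robot"], run["env"], run["train_iters"])
```

With two robots and two environments, this yields four training runs, each for train_iters iterations.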
From within example, you can run group ppo with the following command:
python run_group_ppo.py --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 4 --num-steps 128 --num-mini-batch 4 --log-interval 100 --use-linear-lr-decay --entropy-coef 0.01 --eval-interval 50
All ppo hyperparameters are specified through command line arguments. For more details please see this repo.
To set the parameters of the genetic algorithm, you can edit run_ga.py and change the following:
seed = seed to control randomness
pop_size = the algorithm evolves robots in populations of this size
structure_shape = each robot is represented by an (m,n) matrix of voxels
experiment_name = all experiment files are saved to saved_data/experiment_name
max_evaluations = maximum number of unique robots to evaluate
train_iters = number of iterations of ppo to train each robot's controller
num_cores = number of robots to train in parallel. Note: the total number of processes created will be num_cores * num_processes (as specified below in the command line)
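Because the process count is a product, it can grow quickly. As a rough sketch (total_processes is a hypothetical helper, not part of the repository), the total can be checked against the machine's logical CPU count before launching:

```python
import os


def total_processes(num_cores, num_processes):
    """Robots trained in parallel times PPO processes per robot."""
    if num_cores < 1 or num_processes < 1:
        raise ValueError("num_cores and num_processes must both be at least 1")
    return num_cores * num_processes


# num_processes=4 matches the --num-processes flag used in this readme's commands;
# num_cores=3 is an assumed value chosen only for illustration.
total = total_processes(num_cores=3, num_processes=4)
available = os.cpu_count()  # may be None if it cannot be determined
print(f"processes requested: {total}, logical CPUs available: {available}")
if available is not None and total > available:
    print("warning: more processes requested than logical CPUs")
```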
From within example, you can run the genetic algorithm with the following command:
python run_ga.py --env-name "Walker-v0" --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 4 --num-steps 128 --num-mini-batch 4 --log-interval 100 --use-linear-lr-decay --entropy-coef 0.01 --eval-interval 50
The environment name as well as all ppo hyperparameters are specified through command line arguments. For more details please see this repo.
To set the parameters of bayesian optimization, you can edit run_bo.py and change the following:
seed = seed to control randomness
pop_size = the algorithm evolves robots in populations of this size
structure_shape = each robot is represented by an (m,n) matrix of voxels
experiment_name = all experiment files are saved to saved_data/experiment_name
max_evaluations = maximum number of unique robots to evaluate. Should be a multiple of pop_size
train_iters = number of iterations of ppo to train each robot's controller
num_cores = number of robots to train in parallel. Note: the total number of processes created will be num_cores * num_processes (as specified below in the command line)
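The requirement that max_evaluations be a multiple of pop_size is easy to check before starting a long run. The helper below is a hypothetical sketch, not part of run_bo.py; it rounds a requested budget up to the next multiple, and the numbers used are arbitrary.

```python
def round_up_to_multiple(max_evaluations, pop_size):
    """Return the smallest multiple of pop_size that is >= max_evaluations."""
    if pop_size < 1:
        raise ValueError("pop_size must be at least 1")
    remainder = max_evaluations % pop_size
    if remainder == 0:
        return max_evaluations  # already a multiple, nothing to change
    return max_evaluations + (pop_size - remainder)


# Arbitrary illustrative values.
print(round_up_to_multiple(250, 25))
print(round_up_to_multiple(260, 25))
```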
From within example, you can run bayesian optimization with the following command:
python run_bo.py --env-name "Walker-v0" --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 4 --num-steps 128 --num-mini-batch 4 --log-interval 100 --use-linear-lr-decay --entropy-coef 0.01 --eval-interval 50
The environment name as well as all ppo hyperparameters are specified through command line arguments. For more details please see this repo.
To set the parameters of cppn-neat, you can edit run_cppn_neat.py and change the following:
seed = seed to control randomness
pop_size = the algorithm evolves robots in populations of this size
structure_shape = each robot is represented by an (m,n) matrix of voxels
experiment_name = all experiment files are saved to saved_data/experiment_name
max_evaluations = maximum number of unique robots to evaluate. Should be a multiple of pop_size
train_iters = number of iterations of ppo to train each robot's controller
num_cores = number of robots to train in parallel. Note: the total number of processes created will be num_cores * num_processes (as specified below in the command line)
From within example, you can run cppn-neat with the following command:
python run_cppn_neat.py --env-name "Walker-v0" --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 4 --num-steps 128 --num-mini-batch 4 --log-interval 100 --use-linear-lr-decay --entropy-coef 0.01 --eval-interval 50
The environment name as well as all ppo hyperparameters are specified through command line arguments. For more details please see this repo.
From within example, you can visualize the results from any co-design experiment with the following command:
python visualize.py --env-name "Walker-v0"
Use the appropriate environment name and follow the on-screen instructions.
To visualize the results of a group ppo experiment, use the same command; the environment name is not needed in that case.
To set the parameters of the gif generating script, you can edit make_gifs.py and change the following:
GIF_RESOLUTION = resolution of produced gifs
NUM_PROC = number of gifs to produce in parallel
name = all files are saved to saved_data/all_media/name
experiment_names, env_names = arrays of experiments and their corresponding environments to generate gifs for
load_dir = directory where experiments are stored
generations = array of generation numbers to use in the gif generation process. The default behavior is to use all of them.
ranks = which robots to use in the gif generation process. Robots are given a rank from 1 to n based on their reward during training. The default behavior is to use all robots.
organize_by_experiment = flag specifying whether or not to organize gifs into separate folders for each experiment if multiple experiments are specified in experiment_names
organize_by_generation = flag specifying whether or not to organize gifs into separate folders for each generation
The script runs without any command line arguments.
my_job = Job(
    name = 'ga_walking_experiment_gifs',
    experiment_names = ['ga_walking_experiment'],
    env_names = ['Walker-v0'],
    load_dir = exp_root,
    ranks = [i for i in range(5)],
    organize_by_generation = True,
)
In this example, gifs are generated for the experiment ga_walking_experiment, which contains robots trained in the Walker-v0 environment. Gifs for the top 5 robots in each generation i will be generated and saved to saved_data/all_media/ga_walking_experiment_gifs/generation_i.
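To make the output layout concrete, the sketch below lists the folders the example above would be expected to fill. generation_dirs is a hypothetical helper, not part of make_gifs.py, and the generation numbers 0 to 2 are an assumption made only for illustration.

```python
from pathlib import Path


def generation_dirs(name, generations, media_root="saved_data/all_media"):
    """Build one output folder per generation: media_root/name/generation_i."""
    return [Path(media_root) / name / f"generation_{g}" for g in generations]


# Assumed generation numbers; the real set depends on the experiment.
dirs = generation_dirs("ga_walking_experiment_gifs", range(3))
for d in dirs:
    print(d.as_posix())
```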