This is a study repo where I keep experiments and other work done during my master's thesis. The final code for the research will be stored in a separate repository.
- Integrate Weights & Biases (wandb) with garage to monitor training progress.
- Train ML10 baselines using garage, starting with MAML_TRPO.
- Start working on an off-policy solution, possibly context-based. See "Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables" (PEARL).
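
As a first note on the context idea: the paper mentioned in the last item models the posterior over a latent context variable as a product of independent Gaussian factors, one per observed transition. Below is a minimal one-dimensional sketch of that combination step in plain Python. The function name `product_of_gaussians` and the scalar setting are assumptions made for illustration; this is not code from the paper or from garage.

```python
from typing import List, Tuple


def product_of_gaussians(means: List[float],
                         variances: List[float]) -> Tuple[float, float]:
    """Combine independent 1-D Gaussian factors into a single Gaussian.

    Returns the (mean, variance) of the normalized product.
    Hypothetical helper for illustration only.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal length")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # Precisions (inverse variances) add up under a product of Gaussians.
    precisions = [1.0 / v for v in variances]
    variance = 1.0 / sum(precisions)

    # The combined mean is the precision-weighted average of the factor means.
    mean = variance * sum(m * p for m, p in zip(means, precisions))
    return mean, variance


if __name__ == "__main__":
    # Two equally confident factors centred at 0 and 2.
    print(product_of_gaussians([0.0, 2.0], [1.0, 1.0]))
```

Each additional factor can only shrink the combined variance, which matches the intuition that more collected context should make the task belief more certain.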