RL algorithms with experience replay

1. Double DQN + PER: CartPole-v0, MountainCar-v0
   Plots: CartPole-v0, MountainCar-v0
2. Double DQN + HER: BitFlip, MountainCar-v0
   Plots: BitFlip (n = 30), MountainCar-v0

ACER: todo (MountainCar-v0)
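
For readers unfamiliar with HER, the following is a minimal sketch of hindsight relabeling on a BitFlip task with n = 30. It is not this repository's implementation. The random policy stands in for the Double DQN agent, the "final" relabeling strategy is an assumption (the source does not say which strategy is used), and the names `flip`, `reward`, `run_episode`, and `her_final` are hypothetical.

```python
import random
from typing import List, Tuple

Bits = Tuple[int, ...]
# (state, action, reward, next_state, goal, done)
Transition = Tuple[Bits, int, float, Bits, Bits, bool]


def flip(state: Bits, action: int) -> Bits:
    """Bit-flip dynamics: action i toggles bit i."""
    return state[:action] + (1 - state[action],) + state[action + 1:]


def reward(state: Bits, goal: Bits) -> float:
    """Sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if state == goal else -1.0


def run_episode(n: int, rng: random.Random) -> List[Transition]:
    """Roll out one episode with a random policy (stand-in for the agent)."""
    state = tuple(rng.randint(0, 1) for _ in range(n))
    goal = tuple(rng.randint(0, 1) for _ in range(n))
    episode: List[Transition] = []
    for _ in range(n):
        action = rng.randrange(n)
        next_state = flip(state, action)
        done = next_state == goal
        episode.append((state, action, reward(next_state, goal),
                        next_state, goal, done))
        state = next_state
        if done:
            break
    return episode


def her_final(episode: List[Transition]) -> List[Transition]:
    """Relabel each transition with the state the episode actually ended in."""
    achieved = episode[-1][3]
    relabeled: List[Transition] = []
    for state, action, _, next_state, _, _ in episode:
        relabeled.append((state, action, reward(next_state, achieved),
                          next_state, achieved, next_state == achieved))
    return relabeled


rng = random.Random(0)
episode = run_episode(n=30, rng=rng)
# Store both the original and the relabeled transitions for replay.
replay_buffer = episode + her_final(episode)
```

With n = 30 a random rollout almost never reaches the sampled goal, so the original transitions carry only -1 rewards. The relabeled copies always end with a reward of 0, which is what gives the learner a usable signal.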