The theory is based primarily on the Coursera Reinforcement Learning specialization: https://www.coursera.org/specializations/reinforcement-learning
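To set up the environment, create and activate a virtual environment, then install the dependencies:
python3 -m venv rl-env
source rl-env/bin/activate
pip install -r requirements.txt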
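Before opening the notebooks, it can help to confirm that Python is actually running inside the activated `rl-env` environment. The snippet below is a minimal sketch of one way to check this; the helper name `in_virtualenv` is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this reports `False` after running the activation command, the shell is probably still using the system interpreter.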
Each chapter is a separate Jupyter notebook:
chapter-1-rl-basics.ipynb
chapter-2-k-armed-bandits.ipynb