This package contains libraries and scripts for reproducing the results described in Zhou Z, Kearnes S, Li L, Zare RN, Riley P. Optimization of Molecules via Deep Reinforcement Learning; http://arxiv.org/abs/1810.08678.
The main library functions, such as the MDP definition in chemgraph/mcts/molecules.py, are of primary interest.
Note that this implementation of the MDP has certain limitations, including (but not limited to):
- No support for modification of aromatic bonds. This includes bonds that are perceived as aromatic during parsing of the initial state.
- No support for multiple atom oxidation states. For example, it is not currently possible to generate CS(=O)C from an empty initial state, since the default oxidation state of sulfur is 2 and the MDP actions are based on the available valence without considering alternate oxidation states.
See the paper for additional details.
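The oxidation-state limitation above can be illustrated with a minimal sketch. It is not the code in chemgraph/mcts/molecules.py: the `MAX_VALENCE` table and the `free_valence` and `can_add_bond` helpers are hypothetical names introduced here, and the sketch only assumes that each element has a single fixed maximum valence, as the limitation describes.

```python
# Hypothetical illustration only; not the package's MDP implementation.
# Assumption: each element has exactly one allowed maximum valence.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "S": 2}


def free_valence(element, bond_orders):
    """Return the valence still available on an atom.

    bond_orders lists the orders of the bonds the atom already
    has to other heavy atoms.
    """
    return MAX_VALENCE[element] - sum(bond_orders)


def can_add_bond(element_a, bonds_a, element_b, bonds_b, order):
    """Allow a new bond only if both atoms have enough free valence."""
    return (free_valence(element_a, bonds_a) >= order
            and free_valence(element_b, bonds_b) >= order)


# Sulfur in CSC (dimethyl sulfide) already has two single bonds.
sulfur_bonds = [1, 1]
print(free_valence("S", sulfur_bonds))  # 0

# Reaching CS(=O)C would need a double bond from that sulfur to a new oxygen.
print(can_add_bond("S", sulfur_bonds, "O", [], 2))  # False

# A carbon with one single bond can still accept a double bond to oxygen.
print(can_add_bond("C", [1], "O", [], 2))  # True
```

Under this assumption the sulfur atom has no free valence left once it carries two single bonds, so no action that adds the S=O bond is ever offered, which is why CS(=O)C is unreachable from an empty initial state.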
Run the following commands to reproduce the experimental results. The first two copy the SA_Score contribution from the RDKit repository into the package:
git clone https://github.com/rdkit/rdkit
cp -R ./rdkit/Contrib/SA_Score ./chemgraph/dqn/py
export OUTPUT_DIR="./save"
python optimize_qed.py --model_dir=${OUTPUT_DIR} --hparams="./configs/naive_dqn.json"
python optimize_qed.py --model_dir=${OUTPUT_DIR} --hparams="./configs/bootstrap_dqn_step1.json"
python optimize_qed.py --model_dir=${OUTPUT_DIR} --hparams="./configs/bootstrap_dqn_step2.json"
python optimize_logp.py --model_dir=${OUTPUT_DIR} --hparams="./configs/naive_dqn.json"
python optimize_logp.py --model_dir=${OUTPUT_DIR} --hparams="./configs/bootstrap_dqn_step1.json"
python optimize_logp_of_800_molecules.py --model_dir=${OUTPUT_DIR} --hparams="./configs/naive_dqn_opt_800.json" --similarity_constraint=0.0
python optimize_logp_of_800_molecules.py --model_dir=${OUTPUT_DIR} --hparams="./configs/bootstrap_dqn_opt_800.json" --similarity_constraint=0.0
python multi_obj_opt.py --model_dir=${OUTPUT_DIR} --hparams="./configs/multi_obj_dqn.json" --start_molecule="CCN1c2ccccc2Cc3c(O)ncnc13" --target_molecule="CCN1c2ccccc2Cc3c(O)ncnc13" --similarity_weight=0.0
python target_sas.py --model_dir="${OUTPUT_DIR}" --hparams="./configs/target_sas.json" --start_molecule="CCN1c2ccccc2Cc3c(O)ncnc13" --loss_type="l2" --target_sas=2.5