Yoked machine learning utilizes a teacher model to guide a student model. We provide an example pipeline to evaluate yoked learning performance on both classical (part 1) and deep (part 2) machine learning models.
- ADME: Pharmaco-kinetics (from tdc.single_pred import ADME)
  - CYP2C9 Substrate, Carbon-Mangels et al.
    - data = ADME(name = 'CYP2C9_Substrate_CarbonMangels')
  - CYP2D6 Substrate, Carbon-Mangels et al.
    - data = ADME(name = 'CYP2D6_Substrate_CarbonMangels')
  - CYP3A4 Substrate, Carbon-Mangels et al.
    - data = ADME(name = 'CYP3A4_Substrate_CarbonMangels')
  - HIA (Human Intestinal Absorption), Hou et al.
    - data = ADME(name = 'HIA_Hou')
  - Pgp (P-glycoprotein) Inhibition, Broccatelli et al.
    - data = ADME(name = 'Pgp_Broccatelli')
  - Bioavailability, Ma et al.
    - data = ADME(name = 'Bioavailability_Ma')
- Tox: Toxicity (from tdc.single_pred import Tox)
  - hERG blockers, Wang et al.
    - data = Tox(name = 'hERG')
  - DILI (Drug Induced Liver Injury), Xu et al.
    - data = Tox(name = 'DILI')
  - Skin Reaction, Alves et al.
    - data = Tox(name = 'Skin Reaction')
  - Carcinogens, Lagunin et al.
    - data = Tox(name = 'Carcinogens_Lagunin')
  - ClinTox, Gayvert et al.
    - data = Tox(name = 'ClinTox')
- HTS: High-Throughput Screening (from tdc.single_pred import HTS)
  - SARS-CoV-2 3CL Protease, Diamond
    - data = HTS(name = 'SARSCoV2_3CLPro_Diamond')
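The loaders listed above can be gathered into one lookup so that a script can iterate over every benchmark. This is a minimal sketch, not part of the repository: the names `TDC_DATASETS` and `load_tdc_dataset` are hypothetical, and it assumes the PyTDC package is installed so that `tdc.single_pred` can be imported.

```python
import importlib
from typing import Dict, List

# Dataset names copied from the list above, grouped by TDC loader class.
TDC_DATASETS: Dict[str, List[str]] = {
    "ADME": [
        "CYP2C9_Substrate_CarbonMangels",
        "CYP2D6_Substrate_CarbonMangels",
        "CYP3A4_Substrate_CarbonMangels",
        "HIA_Hou",
        "Pgp_Broccatelli",
        "Bioavailability_Ma",
    ],
    "Tox": ["hERG", "DILI", "Skin Reaction", "Carcinogens_Lagunin", "ClinTox"],
    "HTS": ["SARSCoV2_3CLPro_Diamond"],
}


def load_tdc_dataset(group: str, name: str):
    """Return the TDC data object for one of the datasets listed above.

    `group` is "ADME", "Tox", or "HTS". Hypothetical helper for illustration.
    """
    if name not in TDC_DATASETS.get(group, []):
        raise ValueError(f"Unknown dataset {name!r} for group {group!r}")
    # Imported lazily so the registry can be inspected without PyTDC installed.
    module = importlib.import_module("tdc.single_pred")
    loader = getattr(module, group)
    return loader(name=name)
```

For example, `load_tdc_dataset("ADME", "HIA_Hou").get_split()` should return the train, validation, and test splits that TDC provides for that dataset; the first call downloads the data.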
- Part 1: Code and functions to evaluate yoked learning with classical machine learning models (random forest, naive Bayes, and logistic regression).
  - yoked_machine_learning_pipeline.py contains the functions for evaluating yoked learning.
  - yoked_learning_main.py contains the main function that runs yoked learning.
  - example boxplot/lineplot.ipynb is an example notebook that visualizes comparisons between yoked learning, active learning, and passive learning.
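The idea behind the classical pipeline can be illustrated with a short sketch. This is not the code in yoked_machine_learning_pipeline.py: the function name `yoked_learning_curve`, its parameters, and the uncertainty-based selection rule are assumptions made for illustration. The teacher picks which pool samples get labeled next, and the student is trained and scored on exactly the data the teacher selected. Both models only need `fit`, `predict`, and (for the teacher) `predict_proba`.

```python
import random
from typing import Any, List, Sequence, Tuple


def yoked_learning_curve(
    teacher: Any,
    student: Any,
    X_pool: Sequence[Sequence[float]],
    y_pool: Sequence[int],
    X_test: Sequence[Sequence[float]],
    y_test: Sequence[int],
    n_initial: int = 10,
    batch_size: int = 5,
    n_rounds: int = 10,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Return (number of labeled samples, student accuracy) per round.

    Assumes binary labels 0/1 and that the initial random draw contains
    both classes, so column 1 of predict_proba is P(class 1).
    """
    rng = random.Random(seed)
    indices = list(range(len(X_pool)))
    rng.shuffle(indices)
    labeled, unlabeled = indices[:n_initial], indices[n_initial:]
    curve: List[Tuple[int, float]] = []

    for _ in range(n_rounds):
        X_lab = [X_pool[i] for i in labeled]
        y_lab = [y_pool[i] for i in labeled]

        # Both models see the same labeled set; only the teacher selects.
        teacher.fit(X_lab, y_lab)
        student.fit(X_lab, y_lab)

        preds = student.predict(X_test)
        correct = sum(int(p == t) for p, t in zip(preds, y_test))
        curve.append((len(labeled), correct / len(y_test)))

        if not unlabeled:
            break

        # Teacher-driven uncertainty sampling: take the pool samples whose
        # predicted P(class 1) is closest to 0.5.
        probs = teacher.predict_proba([X_pool[i] for i in unlabeled])
        ranked = sorted(zip(unlabeled, probs), key=lambda pair: abs(pair[1][1] - 0.5))
        chosen = [i for i, _ in ranked[:batch_size]]
        chosen_set = set(chosen)
        labeled = labeled + chosen
        unlabeled = [i for i in unlabeled if i not in chosen_set]

    return curve
```

With scikit-learn installed, a `RandomForestClassifier` could serve as the teacher and a `LogisticRegression` as the student, since both expose the three methods used here. Passing the same model type for both roles gives an ordinary active learning baseline to compare against.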
- Part 2: Code and functions to evaluate yoked learning with deep learning models (MLP).
  - Please refer to MolALKit for details of the deep yoked learning implementation.
  - Example command, run after the data split:
molalkit_run --data_public bace --metrics roc-auc --learning_type explorative --model_config_selector RandomForest_RDKitNorm_Config \
--split_type scaffold_order --split_sizes 0.5 0.5 --evaluate_stride 100 --seed 0 --save_dir bace_rf_yoked_mlp --n_jobs 4 \
--model_config_evaluators MLP_RDKitNorm_BinaryClassification_Config
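To repeat the run above over several random seeds, the command can be assembled in Python. This is a hedged sketch: every flag and value is copied from the command above, while the helper name `build_molalkit_command` and the per-seed output directory names are hypothetical. The actual launch is left commented out so that the script only prints the commands.

```python
import shlex
from typing import List


def build_molalkit_command(seed: int, save_dir: str) -> List[str]:
    """Build the molalkit_run argument list shown above for a given seed."""
    return [
        "molalkit_run",
        "--data_public", "bace",
        "--metrics", "roc-auc",
        "--learning_type", "explorative",
        "--model_config_selector", "RandomForest_RDKitNorm_Config",
        "--split_type", "scaffold_order",
        "--split_sizes", "0.5", "0.5",
        "--evaluate_stride", "100",
        "--seed", str(seed),
        "--save_dir", save_dir,
        "--n_jobs", "4",
        "--model_config_evaluators", "MLP_RDKitNorm_BinaryClassification_Config",
    ]


if __name__ == "__main__":
    for seed in range(3):
        # Hypothetical naming scheme: one output directory per seed.
        cmd = build_molalkit_command(seed, f"bace_rf_yoked_mlp_seed{seed}")
        print(shlex.join(cmd))
        # To launch the run, import subprocess and uncomment:
        # subprocess.run(cmd, check=True)
```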