Inferring Off-Target effects of drugs on cellular signaling using Interactome-Based deep learning

GitHub repository of the study:

Inferring Off-Target effects of drugs on cellular signaling using Interactome-Based deep learning
Nikolaos Meimetis¹, Douglas A. Lauffenburger¹, Avlant Nilsson¹,²,³*

¹ Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

² Department of Cell and Molecular Biology, SciLifeLab, Karolinska Institutet, Sweden

³ Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, SE 41296, Sweden

* Corresponding author, [email protected]

doi: https://doi.org/10.1016/j.isci.2024.109509

This repository is administered by @NickMeim. For questions contact [email protected]

The trained models of this study are too big to be uploaded here and are available upon reasonable request. Supplementary Data File 1.xlsx and Supplementary Data File 2.xlsx are in the results folder.

Ensembles of 50 models are trained for 33 cell lines in the L1000 dataset and are available here:

Many diseases emerge from dysregulated cellular signaling, and drugs are often designed to target specific nodes in cellular networks e.g. signaling proteins, or transcription factors. However, off-target effects are common and may ultimately result in failed clinical trials. Computational modeling of the cell’s transcriptional response to drugs could improve our understanding of their mechanisms of action. Here we develop such an approach based on ensembles of artificial neural networks, that simultaneously infer drug-target interactions and their downstream effects on intracellular signaling. Applied to gene expression data from different cell lines, it outperforms basic machine learning approaches in predicting transcription factors’ activity, while recovering most known drug-target interactions and inferring many new, which we validate in an independent dataset. As a case study, we explore the inferred interactions of the drug Lestaurtinib and its effects on downstream signaling. Beyond its intended target (FLT3) the model predicts an inhibition of CDK2 that enhances downregulation of the cell cycle-critical transcription factor FOXM1, corroborating literature findings. Our approach can therefore enhance our understanding of drug signaling for therapeutic design.

The current repository contains code for:

Initial evaluation of the quality and preprocessing of the data.
Training and fitting of ANN and other models.
Evaluation of the predictions of various models.
Network construction of the MoA of off-target effects of drugs.
Drug-target interaction inference.
Code to re-create the results of the research article.

User case studies

To run your own case study, follow the instructions in each folder (user-friendly scripts are explained in each folder's README file):

First visit the preprocessing folder.
Then visit the learning folder.
Then visit the postprocessing folder.
Finally visit the MoA folder.

Data

The transcriptomic signatures (level 3 and level 5 profiles) of the L1000 CMap resource¹ are used for this study, together with data from the Bioconductor resource².

The transcriptomic profiles were generated by measuring 978 important (landmark) genes in cancer with a Luminex bead-based assay and computationally inferring the rest¹.

Details on how to access these data can be found in the data folder; the main resources are available from GEO under accession GSE92742.
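As a minimal sketch of how the measured (landmark) genes could be separated from the inferred ones, the snippet below filters a tab-separated gene annotation table on a landmark flag. The column names `pr_gene_symbol` and `pr_is_lm`, the placeholder gene names, and the inline toy table are assumptions for illustration only; check the annotation file you actually download for its real layout.

```python
import csv
import io

# Toy stand-in for a tab-separated gene annotation file.
# Layout, column names, and gene names are assumed, not taken from the repository.
TOY_GENE_INFO = (
    "pr_gene_id\tpr_gene_symbol\tpr_is_lm\n"
    "1\tGENE_A\t1\n"
    "2\tGENE_B\t1\n"
    "3\tGENE_C\t0\n"
)


def landmark_symbols(handle, symbol_col="pr_gene_symbol", flag_col="pr_is_lm"):
    """Return the symbols of rows flagged as landmark (directly measured) genes."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row[symbol_col] for row in reader if row[flag_col] == "1"]


landmarks = landmark_symbols(io.StringIO(TOY_GENE_INFO))
print(landmarks)
```

With a real annotation file, the same function can be called on an opened file handle instead of the `io.StringIO` wrapper.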

Folder structure

article_supplementary_info : Folder containing code to re-create the supplementary figures and tables of the article
data : Folder that should contain the retrieved raw data of the study.
figures : Folder containing the scripts to produce the figures of the study.
learning : Folder containing deep learning and machine learning algorithms and models.
preprocessing : Folder containing scripts to pre-process the raw data and evaluate their quality.
- preprocessed_data : Here the pre-processed data to be used in the subsequent analysis are stored.
results : Here the results of a subsequent analysis should be stored. Here you can also find all the inferred interactions in Supplementary Data File 1.xlsx
postprocessing : Folder containing scripts to evaluate models' results and predictions.
MoA : Folder containing code and data to construct the MoA of off-target effects.
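The layout above can be checked with a short script before starting a case study. This is only a sketch: the function names are hypothetical, and the assumption that each stage folder contains a file called `README.md` comes from the general instruction to read the README in each folder, not from a verified listing.

```python
from pathlib import Path

# Top-level folders named in the folder structure list.
EXPECTED_FOLDERS = [
    "article_supplementary_info", "data", "figures", "learning",
    "preprocessing", "results", "postprocessing", "MoA",
]

# Order in which the case-study instructions say to visit the folders.
CASE_STUDY_ORDER = ["preprocessing", "learning", "postprocessing", "MoA"]


def missing_folders(repo_root):
    """Return the expected top-level folders that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


def stage_readmes(repo_root):
    """Return the assumed README path of each case-study stage, in reading order."""
    root = Path(repo_root)
    return [root / stage / "README.md" for stage in CASE_STUDY_ORDER]


# Example: run from the root of a local clone.
print("Missing folders:", missing_folders("."))
for readme in stage_readmes("."):
    print("Read next:", readme)
```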

Installation

The study utilizes multiple resources from the Python and R programming languages.

Important Note:

This installation has been validated on Unix-based, macOS, and Windows operating systems.
On Linux, some external dependencies may need to be installed manually, especially for tidyverse. Please check the libraries' documentation online.
Please note that macOS is not compatible with the GPU components of this installation guide (which are optional).

Python installation

# After installing anaconda create a conda environment:
conda create -n DTLembas
conda activate DTLembas
conda install -c conda-forge rdkit
conda install -c conda-forge scikit-learn 
pip install networkx
# For general (CPU) pytorch version run the following
# Otherwise for GPU installation run for your own cuda version this command: conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install pytorch torchvision torchaudio -c pytorch
conda install captum -c pytorch
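After running the commands above, a quick check inside the activated environment can confirm that the packages are importable. The snippet is a sketch rather than part of the repository; it relies on the usual import names (`torch` for pytorch, `sklearn` for scikit-learn) and only queries CUDA when torch is present, since the GPU is optional.

```python
import importlib.util

# Import names of the packages installed above
# (pytorch is imported as `torch`, scikit-learn as `sklearn`).
REQUIRED_MODULES = ["rdkit", "sklearn", "networkx", "torch", "captum"]


def check_modules(names):
    """Map each module name to True if it can be located in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_modules(REQUIRED_MODULES)
for name, ok in status.items():
    print(f"{name:10s} {'ok' if ok else 'MISSING'}")

# GPU support is optional; only ask about CUDA if torch is installed.
if status["torch"]:
    import torch
    print("CUDA available:", torch.cuda.is_available())
```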

R installation

Install RStudio, open it, and run:

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("cmapR","rhdf5","dorothea","org.Hs.eg.db","hgu133a.db"))
if (!require("tidyverse", quietly = TRUE))
    install.packages("tidyverse")
if (!require("ggplot2", quietly = TRUE))
    install.packages("ggplot2")
install.packages("ggrepel")
install.packages("ggpubr")
install.packages("doRNG")
install.packages("doFuture")

Alternatively, use conda and always use R from the terminal:

conda create -n DTLembas_r_env
conda activate DTLembas_r_env
conda install -c r r-essentials
conda install r-BiocManager
conda install conda-forge::r-ggrepel
conda install r-ggpubr
conda install r-doRNG
conda install r-doFuture
# Start an R session from the terminal, then run the following line inside R:
R
BiocManager::install(c("cmapR","rhdf5","dorothea","org.Hs.eg.db","hgu133a.db"))

R dependencies: You can check the list below and manually install the libraries you need.

In brief, the following R libraries and versions were used to produce the figures and results of the study (other versions of these libraries should also be appropriate):

R version 4.1.2
tidyverse 1.3.1
BiocManager 1.30.16
cmapR 1.4.0
org.Hs.eg.db 3.13.0
rhdf5 2.36.0
doFuture 0.12.0
doRNG 1.8.2
ggplot2 3.3.5
ggpubr 0.4.0
GeneExpressionSignature 1.38.0
caret 6.0-94
ggpubr 0.6.0
ggpattern 1.1.0
ggridges 0.5.4
ggrepel 0.9.3
rstatix 0.7.2
patchwork 1.1.2.9000
dorothea 1.4.2
AnnotationDbi 1.54.1
PharmacoGx 2.4.0
GEOquery 2.60.0
hgu133a.db 3.13.0
limma 3.48.3
affy 1.70.0
dbparser 2.0.1

Python dependencies: First, install conda (Anaconda) on your computer; then run the conda commands from the Python installation section in a bash terminal.

In brief, the following Python libraries and versions were used (different versions are possibly also appropriate):

python 3.8.8
seaborn 0.11.2 (version does not matter for this library)
numpy 1.20.3 (version does not matter for this library)
pandas 1.3.5 (version does not matter for this library)
matplotlib 3.5.1 (version does not matter for this library)
scipy 1.7.3
scikit-learn 1.0.2
networkx 2.6.3
rdkit 2021.03.5
captum 0.5.0
pytorch 1.12.0
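To see how an existing environment compares with the versions listed above, the standard-library `importlib.metadata` module can be queried. This is a sketch under assumptions: the keys are distribution names as pip or conda usually register them (for example `torch` for pytorch), and a build suffix such as `+cpu` in the installed version will be reported as a difference.

```python
from importlib import metadata

# Versions taken from the list above; keys are assumed distribution names.
PINNED = {
    "scipy": "1.7.3",
    "scikit-learn": "1.0.2",
    "networkx": "2.6.3",
    "captum": "0.5.0",
    "torch": "1.12.0",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_to_pinned(pinned):
    """Describe, per package, whether the installed version matches the list."""
    report = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        if found is None:
            report[name] = "not installed"
        elif found == wanted:
            report[name] = "matches"
        else:
            report[name] = f"differs (installed {found}, listed {wanted})"
    return report


for name, note in compare_to_pinned(PINNED).items():
    print(f"{name:14s} {note}")
```

A mismatch is not necessarily a problem, since the list states that other versions may also be appropriate.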

References

1. Subramanian, Aravind, et al. "A next generation connectivity map: L1000 platform and the first 1,000,000 profiles." Cell 171.6 (2017): 1437-1452.
2. Gentleman, Robert C., et al. "Bioconductor: open software development for computational biology and bioinformatics." Genome biology 5.10 (2004): 1-16.

Repository: Lauffenburger-Lab/DrugsANNSignaling