QSAR-COVID-19-App

Works only on Linux and macOS.

Screen.Recording.2023-07-09.at.10.12.18.AM.mov

Model training

Protein target

The model is a random forest regressor trained on 133 bioactivity data points taken from the ChEMBL database in July 2023.

Image from https://www.ebi.ac.uk/chembl/target_report_card/CHEMBL3927/

Model Performance

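from sklearn.ensemble import RandomForestRegressor

# X: fingerprint matrix, Y: pIC50 values (both from the training set)
model = RandomForestRegressor(n_estimators=500, random_state=42)
model.fit(X, Y)
r2 = model.score(X, Y)  # R^2 computed on the training data itself
r2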

This gives r2 = 0.8635050710434334.

Predicted vs. experimental values from this model (note that these are not external predictions; all compounds come from the ChEMBL training set):
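For reference, the score method of a scikit-learn regressor returns the coefficient of determination, 1 - SS_res / SS_tot. Here is a minimal pure-Python sketch of that formula. The function name `r2_score_manual` and the numbers are illustrative assumptions, not data from this project.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Note: a constant y_true (SS_tot == 0) is not handled in this sketch.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up pIC50 values, for illustration only.
experimental = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]
r2_example = r2_score_manual(experimental, predicted)
print(round(r2_example, 3))
```

When the predictions come from the same data the model was fit to, as here, the value measures how well the model fits that data, not how well it predicts new compounds.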

image

Run the app locally

Prerequisites

  • streamlit
  • scikit-learn (imported as sklearn)

Java is also required.

On Linux, run

sudo apt install default-jdk

Java is then added to PATH automatically.

On Mac, run

brew install java

and then add Java to PATH with

echo 'export PATH="/opt/homebrew/opt/openjdk/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
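On either platform, you may want to confirm that Java is reachable before launching the app. Here is a minimal sketch using only the Python standard library. The helper name `find_java` is a hypothetical choice for illustration and is not part of this repo.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


java_path = find_java()
if java_path is None:
    print("java was not found on PATH")
else:
    # `java -version` writes its report to stderr, not stdout.
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True
    )
    print(result.stderr.strip() or result.stdout.strip())
```

If this reports that Java was not found on a Mac, open a new terminal so that the updated ~/.zshrc is loaded, then try again.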

Then clone this repo to your local machine and enter its directory

git clone https://github.com/quantaosun/QSAR-COVID-19-App.git
cd QSAR-COVID-19-App

Launch the app with

streamlit run app.py

Your browser will open the app's interface.

Next, click "Upload a txt file" and select a file containing the SMILES string you want to predict, then click Predict. The result appears at the bottom of the page. You can name the file whatever you like as long as it has the .txt extension. Its content looks like this:

c1ccccc1 benzene

For the moment, you are advised to predict one molecule at a time.
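Such an input file can also be created from a script. The sketch below assumes the layout suggested by the example line: a SMILES string, a space, then a name. The helper names `write_input_file` and `read_input_file` are hypothetical, and how the app itself parses the file is not shown in this README.

```python
import tempfile
from pathlib import Path


def write_input_file(path, smiles, name):
    """Write one '<SMILES> <name>' line to a .txt file and return its path."""
    path = Path(path)
    if path.suffix != ".txt":
        raise ValueError("the input file must have a .txt extension")
    path.write_text(f"{smiles} {name}\n")
    return path


def read_input_file(path):
    """Return (smiles, name) pairs, one per non-empty line."""
    records = []
    for line in Path(path).read_text().splitlines():
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else ""
        records.append((smiles, name))
    return records


# Write the benzene example to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    input_path = write_input_file(Path(tmp) / "benzene.txt", "c1ccccc1", "benzene")
    records = read_input_file(input_path)

print(records)
```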

What happens after you click the Predict button

    1. The SMILES string is converted into a binary string of 264 bits, the length the model expects
    2. The binary string is then assigned to the feature matrix X
    3. X is fed into the trained model, which returns the Y value, i.e. the predicted pIC50
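The last two steps can be sketched as follows. This is an illustration under stated assumptions, not the code in app.py: the fingerprint is a made-up bit pattern, `ConstantModel` is a hypothetical stand-in for the trained random forest, and the helper names are invented. Step 1 (SMILES to fingerprint) is left out because the README does not say how it is done. The one fixed relation used is the definition pIC50 = -log10(IC50 in mol/L).

```python
N_BITS = 264  # fingerprint length stated in step 1


def bits_to_matrix(bit_string):
    """Turn a string of '0'/'1' characters into a 1 x N_BITS matrix."""
    if len(bit_string) != N_BITS or set(bit_string) - {"0", "1"}:
        raise ValueError(f"expected {N_BITS} characters, each 0 or 1")
    return [[int(b) for b in bit_string]]


def pic50_to_ic50_nm(pic50):
    """pIC50 = -log10(IC50 in mol/L), so IC50 in nM = 10 ** (9 - pIC50)."""
    return 10 ** (9 - pic50)


class ConstantModel:
    """Hypothetical stand-in for the trained regressor; always predicts 6.0."""

    def predict(self, X):
        return [6.0 for _ in X]


fingerprint = "01" * (N_BITS // 2)  # made-up bits, not a real fingerprint
X = bits_to_matrix(fingerprint)        # step 2: one row, 264 columns
pic50 = ConstantModel().predict(X)[0]  # step 3: model returns Y (pIC50)
ic50_nm = pic50_to_ic50_nm(pic50)
print(pic50, ic50_nm)
```

Converting the predicted pIC50 back to an IC50 in nM can make the result easier to compare with assay values. A pIC50 of 6 corresponds to 1000 nM.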