
Interactive web App to predict anti-COVID19 3C-like protease bioactivity for given SMILES


LingjieBao1998/QSAR-COVID-19-App

 
 


QSAR-COVID-19-App

Works only on Linux and macOS.

Screen.Recording.2023-07-09.at.10.12.18.AM.mov

Model training

Protein target

The model is a random forest regression model trained on 133 bioactivity data points taken from the ChEMBL database in July 2023.

image Image from https://www.ebi.ac.uk/chembl/target_report_card/CHEMBL3927/

Model Performance

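from sklearn.ensemble import RandomForestRegressor

# X: fingerprint matrix, Y: pIC50 values (both prepared beforehand)
model = RandomForestRegressor(n_estimators=500, random_state=42)
model.fit(X, Y)
r2 = model.score(X, Y)  # R^2 computed on the training data itself
r2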

This gives r2 = 0.8635050710434334 on the training data.

Predicted vs. experimental values from this model (note that these are not external predictions; all points come from the ChEMBL training set)
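For readers unfamiliar with the score, scikit-learn regressors report the coefficient of determination, R² = 1 - SS_res / SS_tot. The sketch below computes it in plain Python on made-up numbers (not ChEMBL data); the function name `r2_score` here is a local helper written for illustration. Because the value reported above is measured on the training set, it is likely more optimistic than a score on held-out molecules would be.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation from the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up pIC50 values, purely for illustration.
experimental = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]
print(r2_score(experimental, predicted))
```

A perfect prediction gives 1.0, and always predicting the mean gives 0.0.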

image

Run the app locally

Prerequisites

  • streamlit
  • scikit-learn (imported as sklearn)

Java is also required.

On Linux (Debian/Ubuntu), run

sudo apt install default-jdk

Java is then added to PATH automatically.

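On Mac, run

brew install java

then add Java to PATH with the following commands (the path shown is for Homebrew on Apple Silicon; on Intel Macs Homebrew lives under /usr/local instead of /opt/homebrew)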

echo 'export PATH="/opt/homebrew/opt/openjdk/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc
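Before launching the app, it may help to confirm that the prerequisites are visible from Python. The following is a minimal sketch, not part of the repository: the function name `missing_prerequisites` is hypothetical, and it only checks that the modules can be located and that a `java` executable is on PATH, not that their versions are suitable.

```python
import importlib.util
import shutil


def missing_prerequisites(modules=("streamlit", "sklearn"), executables=("java",)):
    """Return the names of Python modules and executables that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # shutil.which returns None when the executable is not on PATH.
    missing += [e for e in executables if shutil.which(e) is None]
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found")
```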

Then clone this repo to your local machine and enter its directory

git clone https://github.com/quantaosun/QSAR-COVID-19-App.git
cd QSAR-COVID-19-App

Launch the app with

streamlit run app.py

Your browser will open the app's interface.

Next, click Upload and select a txt file containing the SMILES strings you want to predict, then click Predict; the result is shown at the bottom of the page. You can name the file whatever you like as long as its format is .txt. Its content looks like

c1ccccc1 benzene

For the moment, you are advised to predict one molecule at a time.
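To make the expected file layout concrete, here is a minimal sketch of how such a file could be read. It assumes each non-empty line holds a SMILES string optionally followed by a whitespace-separated name, as in the benzene example; the function name `parse_smiles_lines` is hypothetical and the app's own parsing may differ.

```python
def parse_smiles_lines(text):
    """Parse 'SMILES [name]' lines into (smiles, name) tuples."""
    records = []
    for line in text.splitlines():
        # Split on the first run of whitespace only, so names may contain spaces.
        parts = line.split(maxsplit=1)
        if not parts:
            continue  # skip blank lines
        smiles = parts[0]
        name = parts[1].strip() if len(parts) > 1 else ""
        records.append((smiles, name))
    return records


print(parse_smiles_lines("c1ccccc1 benzene\n"))
# prints [('c1ccccc1', 'benzene')]
```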

What happens after you click the Predict button

    1. The SMILES string is converted into a binary string of 264 bits, the length the model expects
    2. The binary string is then assigned to the feature matrix X
    3. X is fed into the trained model, which returns the Y value, which is essentially the predicted pIC50
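Steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the app's code: the SMILES-to-bits conversion of step 1 depends on the app's fingerprint tool and is not reproduced here, the bit string is made up, and `DummyModel` is a hypothetical stand-in for the trained random forest whose output has no chemical meaning.

```python
N_BITS = 264  # fingerprint length stated in step 1


def bits_to_matrix(bit_string):
    """Turn a '0'/'1' string into a 1 x N_BITS feature matrix (list of lists)."""
    if len(bit_string) != N_BITS or set(bit_string) - {"0", "1"}:
        raise ValueError(f"expected {N_BITS} characters, each '0' or '1'")
    return [[int(ch) for ch in bit_string]]


class DummyModel:
    """Hypothetical stand-in for the trained model; NOT the real predictor."""

    def predict(self, X):
        # Placeholder rule: fraction of set bits scaled to an arbitrary range.
        return [10.0 * sum(row) / len(row) for row in X]


fingerprint = "1" * 66 + "0" * 198  # made-up 264-bit string
X = bits_to_matrix(fingerprint)     # step 2: build the feature matrix X
Y = DummyModel().predict(X)         # step 3: one Y value per row of X
print(len(X[0]), Y)
```

With the real model, `DummyModel()` would be replaced by the fitted regressor, whose `predict` method accepts a matrix with one row per molecule in the same way.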
