ELASPIC2


Predicting the effect of mutations on protein folding and protein-protein interaction.

Usage

Web server

ELASPIC2 has been integrated into the original ELASPIC web server, available at: http://elaspic.kimlab.org.

Python API

The following notebooks can be used to explore the basic functionality of ELASPIC2.

  • 10_stability_demo.ipynb: Notebook showing how to use ELASPIC2 to predict the effect of mutations on protein stability.
  • 10_affinity_demo.ipynb: Notebook showing how to use ELASPIC2 to predict the effect of mutations on protein binding affinity.
  • 10_multiresidue_demo.ipynb: Notebook showing how to use ELASPIC2 to predict the aggregate effect of multiple mutations (this use has not been validated; use at your own risk).

See other notebooks in the notebooks/ directory for more detailed information about how ELASPIC2 models are trained and validated.

REST API

ELASPIC2 is accessible through a REST API, documented at: http://elaspic.kimlab.org/api/v2/docs.

The following code snippet shows how the REST API can be used from within Python.

import time
import requests

ELASPIC2_JOBS_API = "http://elaspic.kimlab.org/api/v2/jobs/"

mutation_info = {
    "protein_structure_url": "https://files.rcsb.org/download/1MFG.pdb",
    "protein_sequence": (
        "GSMEIRVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINI"
        "EHGQAVSLLKTFQNTVELIIVREVSS"
    ),
    "mutations": "G1A,G1C",
    "ligand_sequence": "EYLGLDVPV",
}

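# Submit a job and fail early if the server rejects the request
job_response = requests.post(ELASPIC2_JOBS_API, json=mutation_info)
job_response.raise_for_status()
job_request = job_response.json()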
while True:
    # Wait for the job to finish
    time.sleep(10)
    job_status = requests.get(job_request["web_url"]).json()
    if job_status["status"] in ["error", "success"]:
        break
# Collect results
job_result = requests.get(job_status["web_url"]).json()
# Delete job (optional)
requests.delete(job_request["web_url"]).raise_for_status()
# Show results
print(job_result)
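The polling loop above runs until the job reports "error" or "success", so it never returns if the job stalls. The following is a minimal sketch of the same loop with a timeout. The helper name `wait_for_job` and its parameters are hypothetical and not part of ELASPIC2; the only assumption about the REST API is the "status" field and the two terminal values used in the snippet above.

```python
import time
from typing import Callable, Dict

# The two end states checked in the snippet above.
TERMINAL_STATUSES = ("error", "success")


def wait_for_job(
    fetch_status: Callable[[], Dict],
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call `fetch_status` until the job is finished or `timeout` seconds pass.

    `fetch_status` is any zero-argument callable returning the job's JSON
    document as a dict (hypothetical helper, not an ELASPIC2 function).
    """
    deadline = time.monotonic() + timeout
    while True:
        job_status = fetch_status()
        if job_status["status"] in TERMINAL_STATUSES:
            return job_status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job still {job_status['status']!r} after {timeout} seconds"
            )
        sleep(poll_interval)


# Usage with the variables from the snippet above:
# job_status = wait_for_job(lambda: requests.get(job_request["web_url"]).json())
```

Passing the status lookup as a callable keeps the helper independent of `requests`, which makes it easy to exercise without contacting the server.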

Command-line interface (CLI)

Finally, ELASPIC2 can be used through a command-line interface.

python -m elaspic2 \
  --protein-structure tests/structures/1MFG.pdb \
  --protein-sequence GSMEIRVRVEKDPELGFSISGGVGGRGNPFRPDDDGIFVTRVQPEGPASKLLQPGDKIIQANGYSFINIEHGQAVSLLKTFQNTVELIIVREVSS \
  --ligand-sequence EYLGLDVPV \
  --mutations G1A.G1C
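When the command above is assembled from a script, it can help to check the mutations against the sequence before starting a run. The sketch below is only an illustration: `check_mutations` and `build_cli_args` are hypothetical helpers, not part of ELASPIC2. It assumes that a mutation is written as wild-type residue, 1-based position in the given protein sequence, and mutant residue (as in G1A), and that multiple mutations are joined with "." as in the example above.

```python
import re
from typing import List

# Assumed mutation format: wild-type letter, 1-based position, mutant letter.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def check_mutations(protein_sequence: str, mutations: List[str]) -> None:
    """Raise ValueError if a mutation does not agree with the sequence."""
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            raise ValueError(f"Cannot parse mutation {mutation!r}")
        wild_type, position = match.group(1), int(match.group(2))
        if not 1 <= position <= len(protein_sequence):
            raise ValueError(f"Position of {mutation!r} is outside the sequence")
        found = protein_sequence[position - 1]
        if found != wild_type:
            raise ValueError(
                f"{mutation!r} expects {wild_type} at position {position}, found {found}"
            )


def build_cli_args(
    protein_structure: str,
    protein_sequence: str,
    ligand_sequence: str,
    mutations: List[str],
) -> List[str]:
    """Return the argument list for the command shown above."""
    check_mutations(protein_sequence, mutations)
    return [
        "python", "-m", "elaspic2",
        "--protein-structure", protein_structure,
        "--protein-sequence", protein_sequence,
        "--ligand-sequence", ligand_sequence,
        "--mutations", ".".join(mutations),  # "." separator as in the example
    ]
```

The returned list can be passed to `subprocess.run(args, check=True)`, which avoids shell quoting problems with long sequences.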

Installation

Docker

Docker images that contain ELASPIC2 and all dependencies are available at: https://gitlab.com/elaspic/elaspic2/container_registry.

Conda-pack

Conda-pack tarballs containing ELASPIC2 and all dependencies are available at: http://conda-envs.proteinsolver.org/elaspic2/.

Download the tarball, extract it into a directory of your choice, activate the environment, and run conda-unpack to fix up the path prefixes inside the extracted environment.

wget http://conda-envs.proteinsolver.org/elaspic2/elaspic2-latest.tar.gz
mkdir ~/elaspic2
tar -xzf elaspic2-latest.tar.gz -C ~/elaspic2
source ~/elaspic2/bin/activate
conda-unpack

Conda

ELASPIC2 can be installed using conda. However, the torch-geometric dependencies have to be installed separately.

In the commands below, replace cudatoolkit=10.1 and cu101 to match the desired CUDA version.

conda create -n elaspic2 -c pytorch -c ostrokach-forge -c conda-forge -c defaults elaspic2 "cudatoolkit=10.1"
conda activate elaspic2
pip install "torch-scatter==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-sparse==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-cluster==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-spline-conv==latest+cu101" -f https://pytorch-geometric.com/whl/torch-1.7.0.html
pip install "torch-geometric==1.6.1"

Python package index (PyPI)

ELASPIC2 can be installed using pip. However, the torch and torch-geometric dependencies have to be installed from external channels.

Make sure that git lfs is installed on your system, then run the commands below, replacing cu101 with the tag for the desired CUDA version.

pip install "torch==1.8.0"
pip install -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html --default-timeout=600 \
    "transformers==3.3.1" \
    "torch-scatter==2.0.6" \
    "torch-sparse==0.6.9" \
    "torch-cluster==1.5.9" \
    "torch-spline-conv==1.2.1" \
    "torch-geometric==1.6.1" \
    "https://gitlab.com/kimlab/kmbio/-/archive/v2.1.0/kmbio-v2.1.0.zip" \
    "https://gitlab.com/kimlab/kmtools/-/archive/v0.2.8/kmtools-v0.2.8.zip" \
    "https://gitlab.com/ostrokach/proteinsolver/-/archive/v0.1.25/proteinsolver-v0.1.25.zip" \
    "git+https://gitlab.com/elaspic/elaspic2.git"
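The cu101 tag in the wheel index URL above is the CUDA version with the dot removed (10.1 becomes cu101). The following sketch derives the tag and the index URL from version strings. The function names are hypothetical, the URL pattern is copied from the command above, and the code does not check that an index page exists for a given torch and CUDA combination.

```python
def cuda_tag(cuda_version: str) -> str:
    """Convert a "major.minor" CUDA version such as "10.1" into a tag such as "cu101"."""
    major, minor = cuda_version.split(".")[:2]  # raises ValueError without a minor part
    return f"cu{int(major)}{int(minor)}"


def wheel_index_url(torch_version: str, cuda_version: str) -> str:
    """Build the wheel index URL following the pattern used in the pip command above."""
    return (
        "https://pytorch-geometric.com/whl/"
        f"torch-{torch_version}+{cuda_tag(cuda_version)}.html"
    )


print(wheel_index_url("1.8.0", "10.1"))
```

If torch is already installed, `torch.version.cuda` reports the CUDA version it was built against (it is `None` for CPU-only builds), which can be passed as `cuda_version`.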

Data

Data used to train and validate the ELASPIC2 models are available at http://elaspic2.data.proteinsolver.org and http://protein-folding-energy.data.proteinsolver.org.

The protein-folding-energy repository describes how these data were generated.

Acknowledgements

References

  • Alexey Strokach, Tian Yu Lu, Philip M. Kim. ELASPIC2 (EL2): Combining contextualized language models and graph neural networks to predict effects of mutations. Journal of Molecular Biology. https://doi.org/10.1016/j.jmb.2021.166810.