In this assignment, you will evaluate results from a high-throughput virtual screen against the SARS-CoV-2 Spike protein / human ACE2 interface. The assignment has two parts, and Part 2 requires completion of Part 1. We recommend reading through both parts before you begin.
- Part 1 - API and implementation
- Part 2 - Evaluating clustering
The data we are considering comes from Smith and Smith, 2020. In this study, the authors generated 6 Spike-ACE2 interface poses using MD simulations, then docked ~10k small molecules against each protein conformation. You are provided with the top (#1) pose for each ligand docked against one Spike-ACE2 interface conformation, along with the corresponding SMILES string, AutoDock Vina score, and the "on" bits of the Extended Connectivity Fingerprint (ECFP) for that compound. These can all be found in ligand_information.csv.
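Because the fingerprints are stored as lists of "on" bit indices rather than full bit vectors, you will likely need to expand them before computing distances. The following is a minimal sketch only: it assumes the on bits are stored as a comma-separated string of integer indices and that the fingerprint is 1024 bits long. Neither assumption is confirmed here, so check both against ligand_information.csv. The function names are hypothetical.

```python
N_BITS = 1024  # ASSUMED fingerprint length; verify against the actual data


def parse_on_bits(field):
    """Parse a comma-separated string of bit indices, e.g. "3,17,512".

    The comma-separated format is an assumption about the CSV column.
    """
    return sorted({int(tok) for tok in field.split(",") if tok.strip()})


def to_dense(on_bits, n_bits=N_BITS):
    """Return a list of 0/1 values with ones at the given indices."""
    vec = [0] * n_bits
    for idx in on_bits:
        if not 0 <= idx < n_bits:
            raise ValueError(f"bit index {idx} outside 0..{n_bits - 1}")
        vec[idx] = 1
    return vec


if __name__ == "__main__":
    # Toy input, not taken from the real data set.
    print(to_dense(parse_on_bits("0, 3,5"), n_bits=8))
```

Keeping the sparse index list around alongside the dense vector can be handy, since set-based similarity measures work directly on the indices.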
To run all code in algs/__main__.py (useful for Part 2):
python -m algs
Testing is as simple as running
python -m pytest test/*
from the root directory of this project.
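pytest collects functions whose names start with `test_` from files named `test_*.py`. As a rough illustration of the shape such a file might take, here is a sketch with a toy function defined inline. The file name `test/test_example.py` and the function `count_on_bits` are hypothetical and not part of the provided code; real tests would import from your own modules instead.

```python
# Hypothetical contents of test/test_example.py; all names are illustrative.


def count_on_bits(on_bits):
    """Toy function under test: number of distinct on-bit indices."""
    return len(set(on_bits))


def test_count_on_bits_ignores_duplicates():
    # Repeated indices should be counted once.
    assert count_on_bits([1, 5, 5, 9]) == 3


def test_count_on_bits_empty():
    # An empty fingerprint has no on bits.
    assert count_on_bits([]) == 0
```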