Additional poses and SQM scoring results for the Wang 2015 dataset of protein-ligand complexes
Cite as: M. Jalaie, J. Fanfrlík, A. Pecina, M. Lepšík and J. Řezáč; ChemRxiv preprint, 2025. (https://doi.org/10.26434/chemrxiv-2025-38lf5)
The Wang 2015 dataset consists of 8 target proteins, each with a series of ligands [1]. Each target contains a single protein structure and multiple ligands. There are two complete sets of modeled structures of the P-L complexes of the Wang 2015 dataset, published by Zariquiey et al. [2] and Ross et al. [3].
The Zariquiey structures are expanded by adding a pool of poses obtained by manual modifications and template-based docking.
Target | Ligands | Protein PDB code |
---|---|---|
BACE | 36 | 4DJX (for Zariquiey) & 4DJW (for Ross set) |
CDK2 | 16 | 1H1Q |
JNK1 | 21 | 2GMX |
MLC1 | 42 | 4HW3 |
p38 | 34 | 3FLY |
PTP1B | 23 | 2QBS |
thrombin | 11 | 2ZFF |
Tyk2 | 16 | 4GIH |
Two top-level directories contain data derived from the original sets of structures published by Zariquiey et al. and Ross et al. The directory Ross_structures contains the original poses only. The Zariquiey_structures_extended directory contains the original poses as well as the pool of poses generated in the present work. Both contain 8 subdirectories named after the protein targets. Individual P-L complexes are identified by ligand names.
The ligand poses of the expanded set of Zariquiey structures are named as follows:
- The original poses derived from Zariquiey et al. [2] are denoted as [ligand name]_pose_orig.
- The manually modified poses are denoted as [ligand name]_pose_manual.
- The first docking pose is named just [ligand name], and the other docking poses are named [ligand name]_pose_[number].
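The naming scheme above can be turned into a small parser. The following is a minimal sketch, not part of the dataset tooling: the function name `classify_pose`, the kind labels, and the example ligand name `lig_a` are hypothetical, and it assumes that `[number]` is a plain integer and that no ligand name itself ends in one of the pose suffixes.

```python
import re
from typing import Optional, Tuple

# Assumption: the [number] part of a docking pose name is a plain integer.
_NUMBERED = re.compile(r"^(?P<ligand>.+)_pose_(?P<number>\d+)$")


def classify_pose(name: str) -> Tuple[str, str, Optional[int]]:
    """Return (ligand name, pose kind, docking pose number or None)."""
    if name.endswith("_pose_orig"):
        return name[: -len("_pose_orig")], "original", None
    if name.endswith("_pose_manual"):
        return name[: -len("_pose_manual")], "manual", None
    match = _NUMBERED.match(name)
    if match:
        return match["ligand"], "docking", int(match["number"])
    # No suffix at all: the first docking pose carries the bare ligand name.
    return name, "docking_first", None


# "lig_a" is a made-up ligand name used only for illustration.
for pose_name in ("lig_a_pose_orig", "lig_a_pose_manual", "lig_a", "lig_a_pose_3"):
    print(pose_name, classify_pose(pose_name))
```

Checking the two fixed suffixes before the numbered pattern keeps the classification unambiguous, since `orig` and `manual` can never match `\d+`.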
Protein structures
The structure of the whole protein target prepared for docking is available as proteined_anealed.pdb in the subdirectory protein.
Binding free energies and scores
The free energies and scores are provided as follows:
- The experimental binding free energies are available in the experimental_dG.txt files.
- All the SQM2.20' scores are available in the SQM2.20_all_poses.txt files.
- The selected SQM2.20' scores of the best poses, i.e. the pose with the lowest SQM energy for each ligand, are summarized in the SQM2.20_selected_poses.txt files. (The SQM energies of the optimized complexes are available in the SQM_energies.txt files.)
- For the Zariquiey set, the scores obtained with standard scoring functions are available in the scores subdirectory.
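The pose selection rule described above (report the SQM2.20' score of the pose with the lowest SQM energy) can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the layout of the .txt files is not described here, so the sketch works on in-memory dictionaries, and the function name `select_best_poses`, the ligand names, and all numbers are hypothetical placeholders.

```python
from typing import Dict, Tuple

PoseKey = Tuple[str, str]  # (ligand name, pose name)


def select_best_poses(
    sqm_energies: Dict[PoseKey, float],
    sqm_scores: Dict[PoseKey, float],
) -> Dict[str, Tuple[str, float]]:
    """For each ligand, pick the pose with the lowest SQM energy
    and return that pose's name together with its score."""
    best: Dict[str, PoseKey] = {}
    for key, energy in sqm_energies.items():
        ligand = key[0]
        if ligand not in best or energy < sqm_energies[best[ligand]]:
            best[ligand] = key
    return {ligand: (key[1], sqm_scores[key]) for ligand, key in best.items()}


# Placeholder numbers for illustration only; they are not taken from the dataset.
energies = {
    ("lig_a", "lig_a"): -10.0,
    ("lig_a", "lig_a_pose_2"): -12.5,
    ("lig_b", "lig_b_pose_orig"): -8.0,
}
scores = {
    ("lig_a", "lig_a"): -4.0,
    ("lig_a", "lig_a_pose_2"): -3.0,
    ("lig_b", "lig_b_pose_orig"): -1.5,
}
selected = select_best_poses(energies, scores)
print(selected)
```

Note that the selection criterion is the SQM energy, so the selected pose need not be the one with the lowest score.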
SQM-optimized structures of the complexes
The structures of the optimized P-L complexes are stored in the subdirectory structures of each protein target. Each ligand pose has its own subdirectory containing:
- ligand.sdf: The geometry of the ligand in the optimized complex.
- receptor.pdb: Model of the active site used in the complex.
The ligand.sdf and receptor.pdb files represent the geometry on which the SQM2.20 score was calculated. Additionally, the optimized active site has been ported back into the structure of the whole protein and is provided as protein.pdb.
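The directory layout described above can be navigated with a short helper. This is a minimal sketch under assumptions: the function name `pose_files`, the root directory `data`, and the pose name `lig_a_pose_orig` are hypothetical; the target directory is assumed to be spelled as in the table (e.g. CDK2); and protein.pdb is assumed to sit in the same pose subdirectory as ligand.sdf and receptor.pdb.

```python
from pathlib import Path
from typing import Dict

# File names as listed in this README; the location of protein.pdb
# inside the pose subdirectory is an assumption.
POSE_FILES = ("ligand.sdf", "receptor.pdb", "protein.pdb")


def pose_files(root: str, structure_set: str, target: str, pose: str) -> Dict[str, Path]:
    """Build the expected paths of the structure files for one ligand pose."""
    pose_dir = Path(root) / structure_set / target / "structures" / pose
    return {name: pose_dir / name for name in POSE_FILES}


# "data" and "lig_a_pose_orig" are made-up names used only for illustration.
paths = pose_files("data", "Zariquiey_structures_extended", "CDK2", "lig_a_pose_orig")
for name, path in paths.items():
    # exists() reports whether the file is present in a local copy of the data.
    print(name, path, path.exists())
```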
1. Wang, L.; Wu, Y.; Deng, Y.; Kim, B.; Pierce, L.; Krilov, G.; Lupyan, D.; Robinson, S.; Dahlgren, M. K.; Greenwood, J.; Romero, D. L.; Masse, C.; Knight, J. L.; Steinbrecher, T.; Beuming, T.; Damm, W.; Harder, E.; Sherman, W.; Brewer, M.; Wester, R.; Murcko, M.; Frye, L.; Farid, R.; Lin, T.; Mobley, D. L.; Jorgensen, W. L.; Berne, B. J.; Friesner, R. A.; Abel, R. Accurate and Reliable Prediction of Relative Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field. J. Am. Chem. Soc. 2015, 137, 2695–2703. (https://pubs.acs.org/doi/10.1021/ja512751q)
2. Zariquiey, F. S.; Perez, A.; Majewski, M.; Gallicchio, E.; De Fabritiis, G. Validation of the Alchemical Transfer Method for the Estimation of Relative Binding Affinities of Molecular Series. J. Chem. Inf. Model. 2023, 63, 2438–2444. (https://pubs.acs.org/doi/10.1021/acs.jcim.3c00178)
3. Ross, G. A.; Lu, C.; Scarabelli, G.; Albanese, S. K.; Houang, E.; Abel, R.; Harder, E. D.; Wang, L. The maximal and current accuracy of rigorous protein-ligand binding free energy calculations. Commun. Chem. 2023, 6, 222. (https://doi.org/10.1038/s42004-023-01019-9)