PatWalters/resources_2025

Machine Learning in Drug Discovery Resources 2024

Datasets

OpenADMET
The OpenADMET project seeks to proactively characterize the chemical space accessible to ADMET-associated proteins (“anti-targets”). By applying recent advances in experimental and computational techniques, a comprehensive open library of experimental and structural datasets will be generated.

AIRCHECK
AIRCHECK is a platform that provides access to a large collection of high-quality datasets for drug discovery and development. The datasets are curated from various sources and are available in a standardized format. The current focus appears to be on DNA-encoded library (DEL) data.

Polaris
Polaris aims to improve the state of benchmarking so ML can have a greater impact on real-world drug discovery scenarios. To start, Polaris hopes to provide a single source of truth that aggregates and provides simple access to datasets & benchmarks.

PLINDER
PLINDER is an academic-industry collaboration driven by VantAI, NVIDIA, MIT, and the Computational Structural Biology group at the University of Basel & SIB Swiss Institute of Bioinformatics (co-organizers of CASP). PLINDER aims to provide a gold-standard dataset and evaluations to push the field of computational protein-ligand interaction prediction forward.

Blogs

Charlie’s Substack
Charlie Harris writes about applications of AI in drug discovery. Most recently, his posts have focused on efforts to reproduce AlphaFold3.

Practical Cheminformatics
This is a blog where I post once a month or so. These posts typically contain code that demonstrates various aspects of cheminformatics: clustering, machine learning, data visualization, etc. I occasionally throw in posts containing opinions on things like AI and getting a job.

Is Life Worth Living
A great blog from Iwatobipen (aka pen), whose posts are chock full of great code examples. Pen always seems to be up on the latest methods and posts interesting examples on a variety of topics ranging from quantum chemistry to machine learning.

The RDKit Blog
Greg Landrum is the primary contributor to, and BDFL of, the RDKit. In addition to the latest and greatest features in the RDKit, Greg's posts also touch on a number of key issues in cheminformatics, such as dealing with unbalanced datasets and the impact of fingerprint folding on similarity searching.
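
To give a feel for the fingerprint-folding issue mentioned above, here is a minimal sketch in plain Python. It does not use the RDKit API: the bit indices are invented for illustration, and `fold` and `tanimoto` are hypothetical helper names. It only shows how folding a sparse fingerprint into fewer bits can create collisions that raise the apparent similarity.

```python
def fold(on_bits, n_bits):
    """Fold a sparse fingerprint (a set of on-bit indices) to n_bits via modulo."""
    return {bit % n_bits for bit in on_bits}


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


# Invented sparse fingerprints (assumption: not derived from real molecules)
fp_a = {3, 70, 135, 1029}
fp_b = {3, 71, 262, 2048}

full = tanimoto(fp_a, fp_b)                        # unfolded: one shared bit out of seven
folded = tanimoto(fold(fp_a, 64), fold(fp_b, 64))  # folded to 64 bits: collisions add shared bits

print(f"unfolded: {full:.3f}  folded to 64 bits: {folded:.3f}")
```

In this toy case the folded similarity is higher than the unfolded one, because distinct features land on the same bit. Real fingerprints are much larger, so the size of the effect will differ.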

Tutorials

Practical Cheminformatics Tutorials
This is a collection of Jupyter notebooks that I put together to demonstrate various aspects of cheminformatics and machine learning. The notebooks cover a range of topics from cheminformatics basics to more advanced machine learning. The tutorials all use open source software and can run on Google Colab without installing software locally.

TeachOpenCADD
A great set of tutorials from Andrea Volkamer's group that use Open Source software to teach Computer-Aided Drug Design concepts including molecular similarity, applications of machine learning, and pharmacophore analysis.

The RDKit Cookbook
A terrific resource that provides "recipes" for a number of common tasks.
