Wrote introduction of README.

leojklarner · Jul 10, 2023 · f427c43
1 parent a4b84d6

Showing 4 changed files with 48 additions and 3 deletions.
diff --git a/CITATION.cff b/CITATION.cff
@@ -0,0 +1,16 @@
+
+@InProceedings{pmlr-v202-klarner23a,
+  title = 	 {Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions},
+  author =       {Klarner, Leo and Rudner, Tim G. J. and Reutlinger, Michael and Schindler, Torsten and Morris, Garrett M and Deane, Charlotte and Teh, Yee Whye},
+  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
+  pages = 	 {17176--17197},
+  year = 	 {2023},
+  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
+  volume = 	 {202},
+  series = 	 {Proceedings of Machine Learning Research},
+  month = 	 {23--29 Jul},
+  publisher =    {PMLR},
+  pdf = 	 {https://proceedings.mlr.press/v202/klarner23a/klarner23a.pdf},
+  url = 	 {https://proceedings.mlr.press/v202/klarner23a.html},
+  abstract = 	 {Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift—a setting that poses a challenge to standard deep learning methods. In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques.}
+}
diff --git a/README.md b/README.md
@@ -1,4 +1,33 @@
-# Q-SAVI
-Code Repository Supplementing the Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions Paper.
 
-We're currently refactoring the different codebases used in the paper and all source code will be uploaded by the time the paper is presented at ICML.
+![Q-SAVI: Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions](./images/readme_header.png)
+
+This repository contains an end-to-end pipeline to reproduce and extend the dataset curation, data shift quantification, and empirical evaluation presented in the paper:
+
+**_Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions._** Leo Klarner, Tim G. J. Rudner, Michael Reutlinger, Torsten Schindler, Garrett M. Morris, Charlotte M. Deane, Yee Whye Teh. **ICML 2023**.
+
+<p align="center">
+  &mdash; <a href="https://proceedings.mlr.press/v202/klarner23a/klarner23a.pdf"><b>View Paper</b></a> &mdash;
+</p>
+
+---
+
+**Abstract**: Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labeled data and significant covariate shift—a setting that poses a challenge to standard deep learning methods. 
+<img align="right" src="./images/graphical_abstract.png" width="400px"/>
+In this paper, we present Q-SAVI, a probabilistic model able to address these challenges by encoding explicit prior knowledge of the data-generating process into a prior distribution over functions, presenting researchers with a transparent and probabilistically principled way to encode data-driven modeling preferences. Building on a novel, gold-standard bioactivity dataset that facilitates a meaningful comparison of models in an extrapolative regime, we explore different approaches to induce data shift and construct a challenging evaluation setup. We then demonstrate that using Q-SAVI to integrate contextualized prior knowledge of drug-like chemical space into the modeling process affords substantial gains in predictive accuracy and calibration, outperforming a broad range of state-of-the-art self-supervised pre-training and domain adaptation techniques. 
+
+# Citation
+
+If you found our paper or code useful for your research, please consider citing it as:
+
+```bibtex
+@InProceedings{klarner2023qsavi,
+  title = {Drug Discovery under Covariate Shift with Domain-Informed Prior Distributions over Functions},
+  author = {Klarner, Leo and Rudner, Tim G. J. and Reutlinger, Michael and Schindler, Torsten and Morris, Garrett M and Deane, Charlotte and Teh, Yee Whye},
+  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
+  pages = {17176--17197},
+  year = {2023},
+  volume = {202},
+  series = {Proceedings of Machine Learning Research},
+  publisher = {PMLR},
+}
+```
diff --git a/images/graphical_abstract.png b/images/graphical_abstract.png
diff --git a/images/readme_header.png b/images/readme_header.png