pip install git+https://github.com/songlab-cal/gpn.git
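After installing, it can be useful to confirm that the package is visible to the Python interpreter you intend to use. The snippet below is a minimal sketch using only the standard library; it assumes the package installs under the top-level module name `gpn` (inferred from the repository name), and `is_installed` is a hypothetical helper, not part of the library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be located."""
    # find_spec returns None when no matching top-level module is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package is importable as "gpn".
    status = "found" if is_installed("gpn") else "not found"
    print(f"gpn: {status}")
```

If the module is reported as not found, the install most likely went into a different environment than the one running the check.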
- Quick example to play with the model: basic_example.ipynb
- Application to Arabidopsis thaliana, including training, inference and analysis
- General workflow to create a training dataset given a list of NCBI accessions
Gonzalo Benegas, Sanjit Singh Batra, and Yun S. Song. "DNA language models are powerful zero-shot predictors of non-coding variant effects." bioRxiv (2022).
DOI: 10.1101/2022.08.22.504706