Code for the paper 'SCP4SSD: a Serverless Cloud Platform for the prediction of nucleotide Sequence Synthesis Difficulty'
A web-based serverless application that predicts the DNA synthesis difficulty of any given nucleotide sequence.
Introduction • Installation • Usage & Example • Cite us ❤
This project is inspired by a research paper. Building on that work, we 1) explore more nucleotide sequence features (from 38 to 426), 2) train a more powerful model (from a single RF to ensemble learning), and 3) adopt more sophisticated feature selection methods (from random selection to GA, variance, and correlation methods).
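To give a rough idea of what variance- and correlation-based feature selection can look like, here is a minimal sketch in plain Python. It is not the code used in this repo: the function names, the thresholds (`var_threshold`, `corr_threshold`), and the toy feature columns are all assumptions made for illustration, and the GA step is not shown.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def select_features(columns, var_threshold=1e-3, corr_threshold=0.95):
    """Return indices of the feature columns that survive both filters.

    Thresholds are hypothetical defaults, not values taken from the paper.
    """
    # Step 1: drop near-constant columns (variance filter).
    candidates = [j for j, col in enumerate(columns)
                  if pvariance(col) > var_threshold]
    # Step 2: greedily drop columns that are highly correlated
    # with a column that has already been kept (correlation filter).
    kept = []
    for j in candidates:
        if all(abs(pearson(columns[j], columns[k])) <= corr_threshold
               for k in kept):
            kept.append(j)
    return kept


# Toy data: four made-up feature columns over six sequences.
columns = [
    [0.2, 0.4, 0.6, 0.8, 0.5, 0.3],  # feature 0
    [0.4, 0.8, 1.2, 1.6, 1.0, 0.6],  # feature 1 = 2 * feature 0 (redundant)
    [5, 1, 4, 2, 6, 3],              # feature 2, weakly correlated with 0
    [1, 1, 1, 1, 1, 1],              # feature 3, constant
]
print(select_features(columns))  # [0, 2]
```

Feature 3 is removed by the variance filter and feature 1 by the correlation filter, so only features 0 and 2 remain.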
- Clone the repo
git clone https://github.com/JustinDoIt/scp4ssd.git
cd scp4ssd
- Create the Anaconda environment
conda env create -f environment.yml
- Activate the environment
conda activate scp4ssd
- Install auto-sklearn
conda install auto-sklearn=0.14.6 -c conda-forge
python predict.py --fasta ./examples/example.fna --out ./examples/example_out.csv
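Once the command has finished, the output CSV can be inspected with a few lines of Python. The sketch below assumes only that the file starts with a header row; the actual column names written by `predict.py` are not documented here, so the code prints whatever it finds. `preview_csv` is a hypothetical helper, not part of this repo.

```python
import csv
from pathlib import Path


def preview_csv(path, n=5):
    """Return the header and the first n rows of a CSV file.

    Assumes the first line of the file is a header row.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = [row for _, row in zip(range(n), reader)]
        return reader.fieldnames, rows


# Path taken from the usage command above; skipped if the file is absent.
out_path = Path("./examples/example_out.csv")
if out_path.exists():
    header, rows = preview_csv(out_path)
    print(header)
    for row in rows:
        print(row)
```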
If this repo helps you, please cite our paper (coming soon).
Distributed under the MIT License. See LICENSE for more information.