audy/sars-cov-2

 
 

SARS-CoV-2

Version v0.4.10


SARS-CoV-2 variant calling and consensus assembly pipeline for ARTIC v3 amplicons sequenced on Illumina or Oxford Nanopore platforms

Quick start

Build the Docker image:

docker build -t covid19 .

Run the pipeline in the Docker image:

docker \
  run \
  --rm \
  --workdir /data \
  --volume `pwd`:/data \
  --entrypoint /bin/bash \
  --env prefix=test-covid19 \
  --env reference=reference/nCoV-2019.reference.fasta \
  --env input_fastq=data/twist-target-capture/RNA_control_spike_in_10_6_100k_reads.fastq.gz \
  --env primer_bed_file=reference/artic-v1/ARTIC-V3.bed \
  covid19 \
  jobscript.sh

For Oxford Nanopore data, also set INSTRUMENT_VENDOR:

docker \
  run \
  --rm \
  --workdir /data \
  --volume `pwd`:/data \
  --entrypoint /bin/bash \
  --env prefix=test-covid19 \
  --env INSTRUMENT_VENDOR="Oxford Nanopore" \
  --env reference=reference/nCoV-2019.reference.fasta \
  --env input_fastq=data/twist-target-capture/RNA_control_spike_in_10_6_100k_reads.fastq.gz \
  --env primer_bed_file=reference/artic-v1/ARTIC-V3.bed \
  covid19 \
  jobscript.sh

The pipeline currently produces a consensus sequence (consensus.fa), variant calls (variants.vcf), a BAM file (covid19.bam), Nextstrain results (nextstrain.json), and pangolin results (pangolin.csv).
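As a quick sanity check after a run, the output files listed above can be verified with a small shell helper. This is only a sketch: the function name check_outputs is hypothetical and not part of the pipeline, and it assumes the files are written to a single directory under exactly the names given in this README.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pipeline): checks that every output
# file named in this README exists and is non-empty in the given directory.
check_outputs() {
  dir="${1:-.}"
  missing=0
  for f in consensus.fa variants.vcf covid19.bam nextstrain.json pangolin.csv; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${f}" >&2
      missing=1
    fi
  done
  # Exit status 0 when all files are present, 1 otherwise.
  return "${missing}"
}
```

For example, check_outputs . && echo "all outputs present" reports success only when all five files are present in the current directory.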

Development & Testing

To run tests, run pytest.

This repository includes a local requirements.txt file for quickly running golden output tests across a variety of datasets. GitHub Actions is set up to build the Docker image automatically and run those tests on it. The tests guard against regressions: they ensure that parameter and pipeline changes do not alter variant calls or consensus sequence generation.
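The idea behind a golden output test can be sketched in a few lines of shell. This is an illustration, not the repository's actual test code: the function name compare_to_golden is hypothetical, and skipping the VCF meta-information lines (those starting with ##, which may carry dates or command lines) is an assumption about what should be ignored.

```shell
#!/bin/sh
# Hypothetical golden-output check (not the repository's test code):
# succeeds when two VCF files agree once "##" meta-information lines
# are ignored.
compare_to_golden() {
  # Refuse to compare if either file is unreadable.
  [ -r "$1" ] && [ -r "$2" ] || return 2
  # grep exits non-zero when no lines remain, so tolerate that case.
  actual_body="$(grep -v '^##' "$1" || true)"
  golden_body="$(grep -v '^##' "$2" || true)"
  [ "${actual_body}" = "${golden_body}" ]
}
```

A call such as compare_to_golden variants.vcf expected/variants.vcf (the expected/ path is a placeholder) returns a non-zero status as soon as a variant record differs from the stored copy.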

Currently, the following integration tests are run:

  • Simulated Illumina data from the SARS-CoV-2 reference including simulated variants across the genome
  • Example Twist hybrid capture data (Illumina)
  • Example ARTIC v1 amplicon sequencing data (Illumina)

The repository also uses pre-commit to keep the code clean and orderly. To get started, first install the requirements (Python 3 required):

pip install -r requirements.txt

Then install the pre-commit hooks:

pre-commit install --install-hooks

You will also need shellcheck installed on your system (brew install shellcheck on a Mac).

Acknowledgments

Many thanks are due across the community, including but not limited to:
