Example code that demonstrates how to store, process, and query genomic and biological datasets using Amazon Omics
Amazon Omics helps healthcare and life sciences customers store, query, analyze, and generate insights from genomic and other biological data to improve human health.
This repository contains resources (for example, scripts and Jupyter notebooks) that demonstrate how to use Amazon Omics.
The quickest setup to run example notebooks includes:
- An AWS account
- Proper IAM User and Role setup
- An Amazon SageMaker Notebook Instance
The example notebooks cover:
- Using Omics Storage with genomics references and readsets: Get acquainted with Omics Storage by creating reference and sequence stores, importing data from FASTQ and CRAM files, and downloading readsets.
- Running WDL and Nextflow pipelines with Omics Workflows: Learn how to create, run, and debug WDL- and Nextflow-based pipelines that process data from Omics Storage and Amazon S3 using Omics Workflows.
- Querying annotations and variants with Omics Analytics: Get started with Omics Analytics by importing variant and annotation data from VCF, TSV, and GFF files, and performing genome-scale analysis queries using Amazon Athena.
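As a rough orientation before opening the notebooks, the sketch below shows how the request payloads for a reference import, a readset import, and a workflow run might be assembled in Python. It is a minimal sketch, not the notebooks' own code: the helper function names are hypothetical, every ID, ARN, and S3 URI in the usage comments is a placeholder, and the field names reflect our understanding of the boto3 `omics` client, so check them against the current API reference before relying on them.

```python
from typing import Any, Dict, List


def reference_import_request(store_id: str, role_arn: str,
                             name: str, fasta_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for omics.start_reference_import_job (assumed shape)."""
    return {
        "referenceStoreId": store_id,
        "roleArn": role_arn,
        "sources": [{"sourceFile": fasta_uri, "name": name}],
    }


def read_set_import_request(store_id: str, role_arn: str, reference_arn: str,
                            sample_id: str, subject_id: str,
                            fastq_uris: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for omics.start_read_set_import_job (assumed shape)."""
    if not 1 <= len(fastq_uris) <= 2:
        raise ValueError("expected one (single-end) or two (paired-end) FASTQ files")
    source_files = {"source1": fastq_uris[0]}
    if len(fastq_uris) == 2:
        source_files["source2"] = fastq_uris[1]
    return {
        "sequenceStoreId": store_id,
        "roleArn": role_arn,
        "sources": [{
            "sourceFiles": source_files,
            "sourceFileType": "FASTQ",
            "sampleId": sample_id,
            "subjectId": subject_id,
            "referenceArn": reference_arn,
            "name": sample_id,
        }],
    }


def workflow_run_request(workflow_id: str, role_arn: str, run_name: str,
                         parameters: Dict[str, Any],
                         output_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for omics.start_run (assumed shape)."""
    return {
        "workflowId": workflow_id,
        "roleArn": role_arn,
        "name": run_name,
        "parameters": parameters,
        "outputUri": output_uri,
    }


# Usage sketch (requires boto3 and AWS credentials; all values are placeholders):
#
#   import boto3
#   omics = boto3.client("omics")
#   ref_store = omics.create_reference_store(name="example-reference-store")
#   omics.start_reference_import_job(**reference_import_request(
#       ref_store["id"],
#       "arn:aws:iam::111122223333:role/ExampleOmicsRole",
#       "example-reference",
#       "s3://example-bucket/reference.fasta",
#   ))
```

Keeping the payload construction separate from the client calls makes the request shapes easy to inspect or unit test without an AWS account; the notebooks themselves call the service directly.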
This library is licensed under the Apache 2.0 License. For details, see the LICENSE file.
See the Security issue notifications section of our contributing guidelines for more information.
Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from external sources. Please bear with us in the short term if pull requests take longer than expected or are closed. If you'd like to open an issue or submit a pull request, please read our contributing guidelines first.