uaauaguga/metatranscriptome-pipeline


Bacterial noncoding transcript analysis

  • This pipeline takes paired metagenomic sequencing (MGX) and metatranscriptomic sequencing (MTX) data as input and annotates intergenic regions that show RNA expression
  • Supports both paired-end and single-end data

Preprocess MGX reads

  • Trim adapters with trimgalore
  • Remove unwanted sequences, e.g. host sequences such as the human genome for gut metagenome data (optional)
  • Run metaphlan for taxonomic profiling
# paired end data
snakemake --snakefile snakefiles/preprocessing.snakefile --configfile config/preprocessing/test-pe.mgx.yaml

# single end data
snakemake --snakefile snakefiles/preprocessing.snakefile --configfile config/preprocessing/test-se.mgx.yaml

Preprocess MTX reads

  • Trim adapters
  • Remove unwanted sequences (optional)
  • Run metaphlan for taxonomic profiling
  • Infer the strandedness of the RNA library from the marker gene mapping results
# paired end data
snakemake --snakefile snakefiles/preprocessing.snakefile --configfile config/preprocessing/test-pe.mtx.yaml

# single end data
snakemake --snakefile snakefiles/preprocessing.snakefile --configfile config/preprocessing/test-se.mtx.yaml
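
The strandedness step can be pictured with a small sketch. This is not the pipeline's own code: the function name, the input counts, and the 0.8 threshold are assumptions chosen for illustration. It only shows the general idea of comparing how many reads map to marker genes in the sense versus the antisense orientation.

```python
def infer_strandedness(sense_reads, antisense_reads, threshold=0.8):
    """Classify a library from read orientation relative to marker genes.

    The threshold is a hypothetical cutoff, not a value taken from the pipeline.
    """
    total = sense_reads + antisense_reads
    if total == 0:
        raise ValueError("no reads mapped to marker genes")
    sense_fraction = sense_reads / total
    if sense_fraction >= threshold:
        return "forward"      # most reads match the gene strand
    if sense_fraction <= 1 - threshold:
        return "reverse"      # most reads match the opposite strand
    return "unstranded"       # no clear orientation bias


print(infer_strandedness(900, 100))
print(infer_strandedness(500, 500))
```

A library whose reads split roughly evenly between the two orientations is treated as unstranded; the labels "forward" and "reverse" are placeholders and may not match the names used in the pipeline's output.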

Assemble MGX reads and annotate contigs

  • Assemble MGX reads with megahit
  • Run prodigal for gene prediction
  • Annotate predicted genes against Pfam with hmmsearch
  • Search contigs for known noncoding RNAs against Rfam with cmsearch (optional)
# paired end data
snakemake --snakefile snakefiles/mgx-analysis.snakefile --configfile config/mgx-analysis/test-pe.yaml

# single end data
snakemake --snakefile snakefiles/mgx-analysis.snakefile --configfile config/mgx-analysis/test-se.yaml

Mapping MTX reads

  • Map MTX reads to the contigs assembled from the paired MGX sample
  • Assemble transcripts with stringtie
  • Annotate transcripts in a gene-centric manner
# paired end data
snakemake --snakefile snakefiles/mtx-analysis.with.mgx.snakefile --configfile config/mtx-analysis-with-mgx/test-pe.yaml

# single end data
snakemake --snakefile snakefiles/mtx-analysis.with.mgx.snakefile --configfile config/mtx-analysis-with-mgx/test-se.yaml
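
The four stages above can be collected into one ordered list of commands. The sketch below is only a convenience wrapper, assuming the stages are meant to run in the order this README lists them; the snakefile and config paths are copied from the commands above, while `build_commands` and the `layout` argument are hypothetical names. It prints the commands rather than executing them, and appends snakemake's `-n` (dry run) flag by default.

```python
import shlex

# Stage order and file paths are copied from the commands in this README.
STAGES = [
    ("snakefiles/preprocessing.snakefile", "config/preprocessing/test-{layout}.mgx.yaml"),
    ("snakefiles/preprocessing.snakefile", "config/preprocessing/test-{layout}.mtx.yaml"),
    ("snakefiles/mgx-analysis.snakefile", "config/mgx-analysis/test-{layout}.yaml"),
    ("snakefiles/mtx-analysis.with.mgx.snakefile", "config/mtx-analysis-with-mgx/test-{layout}.yaml"),
]


def build_commands(layout, dry_run=True):
    """Return one snakemake argv list per stage; layout is 'pe' or 'se'."""
    if layout not in ("pe", "se"):
        raise ValueError("layout must be 'pe' or 'se'")
    commands = []
    for snakefile, config in STAGES:
        cmd = ["snakemake", "--snakefile", snakefile,
               "--configfile", config.format(layout=layout)]
        if dry_run:
            cmd.append("-n")  # dry run: list the jobs without executing them
        commands.append(cmd)
    return commands


# Print the paired-end commands so they can be reviewed before running.
for cmd in build_commands("pe"):
    print(shlex.join(cmd))
```

Depending on the installed snakemake version, additional options such as `--cores` may be required for a real run; they are left out here because the README does not specify them.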

Downstream analysis

  • Get transcripts in intergenic regions whose distance to the nearest CDS is >= 16 nt
  • Retrieve the intergenic regions containing these transcripts
  • Run FragGeneScan on these intergenic regions to predict candidate CDSs
  • Run cmsearch on these intergenic regions to annotate known RNAs
  • Only transcripts that do not overlap known RNAs or coding regions are kept for downstream analysis
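
The filtering logic in the steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the pipeline's implementation: intervals are 0-based half-open `(start, end)` tuples on a single contig, strand is ignored, "distance" is taken to be the number of bases between a transcript and a CDS, and all function names are hypothetical. The `masked` list stands in for cmsearch hits and FragGeneScan predictions.

```python
MIN_CDS_DISTANCE = 16  # nt, the cutoff stated in the README


def gap(a, b):
    """Bases between two half-open intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def select_candidates(transcripts, cds, masked):
    """Keep transcripts far enough from every CDS and clear of masked regions."""
    kept = []
    for t in transcripts:
        far_from_cds = all(gap(t, c) >= MIN_CDS_DISTANCE for c in cds)
        clean = not any(overlaps(t, m) for m in masked)
        if far_from_cds and clean:
            kept.append(t)
    return kept


cds = [(0, 300), (800, 1200)]                        # annotated genes
transcripts = [(305, 400), (400, 600), (500, 700)]   # assembled transcripts
masked = [(650, 720)]                                # known RNA or predicted CDS

print(select_candidates(transcripts, cds, masked))
```

In this toy input the first transcript lies 5 nt from a CDS and the third overlaps a masked region, so only `(400, 600)` is kept. Real data spans many contigs, so the same check would be applied per contig.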
