A Nextflow pipeline for quantification and quality control of single-end RNA-seq data from C. elegans.
Use the commands below to clone this pipeline to Quest and run it:
git clone https://github.com/gaotian52/SEmRNA-seq-nf.git
cd SEmRNA-seq-nf
nextflow kallisto_SE.nf --fqs=test.tsv
- --fqs
The input sequences are specified in a sample sheet; see the example in test.tsv. The columns are, in order, strain_name, sample_name, and raw_FASTQ. Note that sample_name must be unique for each sequence file.
- --ref
Reference genome. Default = c_elegans.PRJNA13758.WS276.genomic.fa.gz
- --vcf
Variant Call Format (VCF) file. Default = WI.20200815.hard-filter.vcf.gz
- --gff3
GFF3 file. Default = c_elegans.PRJNA13758.WS276.annotations.gff3.gz
- --out
Output directory. Default = "RNAseq_SE-${date}"
- --fragment_len
Estimated average fragment length. See the kallisto documentation for details. Default = "70"
- --fragment_sd
Estimated standard deviation of fragment length. See the kallisto documentation for details. Default = "50"
- --bootstrap
Number of bootstrap samples used by kallisto. Default = "100"
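Since sample_name must be unique, it can help to check the sample sheet before launching with non-default options. The following is only a sketch under stated assumptions: the sheet is tab-separated with sample_name in the second column, and the names my_samples.tsv and my_results and the fragment values 200 and 30 are hypothetical placeholders, not recommendations. The `--option=value` form follows the `--fqs=test.tsv` example above.

```shell
# Print every sample_name (assumed to be column 2 of a tab-separated
# sheet) that appears more than once. Prints nothing if all are unique.
find_duplicate_samples() {
    cut -f2 "$1" | sort | uniq -d
}

# Hypothetical inputs; replace with your own sheet and settings.
SHEET="my_samples.tsv"
OUT="my_results"
FRAGMENT_LEN=200
FRAGMENT_SD=30

# Options not listed here (--ref, --vcf, --gff3, --bootstrap) keep their defaults.
cmd="nextflow kallisto_SE.nf --fqs=${SHEET} --out=${OUT} --fragment_len=${FRAGMENT_LEN} --fragment_sd=${FRAGMENT_SD}"

# Only print the launch command if the sheet exists and has no duplicates.
if [ -f "$SHEET" ] && [ -z "$(find_duplicate_samples "$SHEET")" ]; then
    echo "Sample sheet looks fine. Launch with:"
    echo "$cmd"
else
    echo "Fix $SHEET first: file missing or duplicate sample_name values"
fi
```

The script only prints the command rather than running it, so the options can be reviewed before the pipeline starts.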
This pipeline will generate two folders, kallisto and multiqc_report, in your working directory.
kallisto/: RNA-seq mapping results
multiqc_report/:
├── multiqc_pre_trim_fastqc.html # Summary of FastQC results on raw FASTQ files
├── multiqc_post_trim_fastqc.html # Summary of FastQC results on FASTQ files trimmed by fastp
└── multiqc_kallisto.html # Summary of kallisto log
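After a run finishes, a quick way to confirm that the three reports listed above were written is a small check like the one below. This is a sketch, not part of the pipeline: the function name check_reports is hypothetical, and it assumes the report file names match the listing exactly.

```shell
# Return 0 if all three MultiQC reports exist in the given directory,
# otherwise print the missing ones and return 1.
check_reports() {
    dir="$1"
    missing=0
    for f in multiqc_pre_trim_fastqc.html \
             multiqc_post_trim_fastqc.html \
             multiqc_kallisto.html; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_reports multiqc_report && echo "all reports present"` can be run from the directory that contains the multiqc_report folder.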