smncRNA analysis template

Analysis template for analysing, visualising and communicating the findings of Illumina small RNA sequencing data, with a focus on small non-coding RNAs that are sometimes forgotten (i.e. looking at and beyond miRNAs). It is set up to analyse human data, but can be adapted to other species with some tweaks to the code, namely the pipeline inputs. Think of this as a collection of scripts that requires some familiarity with UNIX and R, rather than a fully automated workflow.

What this template can do

This template uses open source tools and includes several scripts for researchers to analyse, explore and communicate findings to other researchers through interactive tables and plots. The results of this template can be served as a GitHub page that renders html files and provides links to RShiny apps hosted on shinyapps.io. This means a single web link can be given to your collaborators to provide them with all your analysis code and results. Most importantly, this template puts you in good stead to ensure your analysis is reproducible!

What this template can't do

Tell you what analysis tools and parameters are appropriate for your data or research question. The assumption is that the tools this template uses are tools you've intentionally chosen, and that you will actively adapt this template for your use-case
Account for different operating systems and compute infrastructures. This means some UNIX experience will likely be required to run the pipelines/scripts on your operating system or job scheduler. I won't tell you how to do this here, but the pipelines and tools used here are generally portable (i.e. able to be run on different operating systems) and I've used renv environments to make the R code more portable
Automate the whole analysis, because it probably shouldn't be automated

What's this template gonna do?

This template will guide you through processing the data with two pipelines: smrnaseq and exceRpt.

Both pipelines undertake preprocessing, filtering, alignment and reporting. The first pipeline (smrnaseq) outputs counts of miRNAs. The second pipeline (exceRpt) outputs counts of miRNAs as well as other RNA species such as tRNAs, piRNAs, circRNAs etc.

In addition to the QC built into the pipelines, further QC summarises the read counts and mapping rates of the data.

The counts datasets output by the pipelines are analysed in R in a differential expression analysis across all these RNA species to find differentially expressed RNAs. Two methods are used for the differential expression analysis, namely limma/voom and DESeq2.

Beyond a traditional differential expression analysis, the data is prepared and presented in an interactive RShiny app that allows the user to explore RNA expression (both raw counts and counts per million).
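For readers unfamiliar with counts per million (CPM), here is a minimal sketch of the basic calculation: each raw count is divided by the sample's total count and multiplied by one million. The function name and the example RNA names and counts are made up for illustration, and the template's own R code may calculate CPM differently (for example with library size normalisation factors).

```python
def counts_per_million(counts):
    """Scale one sample's raw counts so that they sum to one million."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no counts, CPM is undefined")
    return {rna: count * 1_000_000 / library_size for rna, count in counts.items()}


# Hypothetical raw counts for a single sample (illustrative values only)
raw_counts = {"example_miRNA": 750, "example_tRNA": 200, "example_piRNA": 50}

for rna, cpm in counts_per_million(raw_counts).items():
    print(rna, cpm)
```

Because CPM rescales every sample to the same total, it makes expression values easier to compare between samples sequenced to different depths.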

Interactive MDS and PCA plots are also created to explore clusters of RNAs/samples in the data.

Lastly, the composition of the RNA species is explored.

Testing

This template has been validated to work on:

Test fastq data available in the test_fastq directory

How to use this template

1. Fork the template repo to a personal or lab account

See here for help

2. Take this template to the data on your local machine

Clone your forked smncrna_analysis_template repo to the machine you'd like to analyse the data on (replace leahkemp in the URL below with the account you forked to)

git clone https://github.com/leahkemp/smncrna_analysis_template.git

3. Format your input files

Fastq naming convention

sample_S*_R1.fastq.gz

one fastq file per sample

For example see the test fastq files here
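A quick way to check file names before running anything is sketched below. It assumes one reading of the convention above: "sample" stands for your sample name and "S*" is "S" followed by any characters other than an underscore (such as the S1, S2 indexes Illumina adds). The helper name and the example file names are hypothetical and are not part of the template.

```python
import re

# Assumed interpretation of sample_S*_R1.fastq.gz:
# <sample name>_S<anything without an underscore>_R1.fastq.gz
FASTQ_PATTERN = re.compile(r"^(?P<sample>.+)_S[^_]*_R1\.fastq\.gz$")


def sample_name(filename):
    """Return the sample name if the file follows the convention, else None."""
    match = FASTQ_PATTERN.match(filename)
    return match.group("sample") if match else None


# Hypothetical file names for illustration
for filename in ["ctrl_1_S1_R1.fastq.gz", "ctrl_1_R1.fastq.gz", "ctrl_1_S1_R2.fastq.gz"]:
    name = sample_name(filename)
    if name is None:
        print(filename, "does not follow the naming convention")
    else:
        print(filename, "belongs to sample", name)
```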

Metadata file

Required columns:

"sample"
- must be titled "sample"
- must contain a row with a unique sample name/id for each fastq file present in the directory of fastq files to be analysed
"treatment"
- must be titled "treatment"

Other notes:

you can have additional columns
you can't have any duplicate column names, and names that differ only by case count as duplicates (e.g. two columns named "sample" and "Sample")
make sure every sample in the fastq directory to be analysed is included in the metadata file and is associated with a treatment group

For example see the test metadata file here
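The rules above can be checked with a short script before starting the analysis. This is only a sketch under stated assumptions: it assumes the metadata file is a comma-separated file with a header row (the format is not specified here), and the function name, sample names and treatment labels are hypothetical.

```python
import csv
import io


def check_metadata(metadata_csv, fastq_samples):
    """Return a list of problems found in the metadata (empty if none)."""
    reader = csv.DictReader(io.StringIO(metadata_csv))
    columns = reader.fieldnames or []
    problems = []

    # Required columns must be titled exactly "sample" and "treatment"
    for required in ("sample", "treatment"):
        if required not in columns:
            problems.append(f"missing required column: {required}")

    # Column names that differ only by case count as duplicates
    lowered = [column.lower() for column in columns]
    for name in sorted(set(lowered)):
        if lowered.count(name) > 1:
            problems.append(f"duplicate column name (ignoring case): {name}")
    if problems:
        return problems

    # Each sample name must be unique
    treatment_by_sample = {}
    for row in reader:
        name = row["sample"]
        if name in treatment_by_sample:
            problems.append(f"sample listed more than once: {name}")
        treatment_by_sample[name] = (row["treatment"] or "").strip()

    # Every fastq sample needs a metadata row and a treatment group
    for name in fastq_samples:
        if name not in treatment_by_sample:
            problems.append(f"fastq sample missing from metadata: {name}")
        elif not treatment_by_sample[name]:
            problems.append(f"sample has no treatment group: {name}")
    return problems


# Hypothetical metadata for illustration, with an extra "batch" column
metadata = "sample,treatment,batch\nctrl_1,control,A\ntreat_1,treated,A\n"
print(check_metadata(metadata, ["ctrl_1", "treat_1", "treat_2"]))
```

In practice the list of fastq samples would come from the file names in your fastq directory rather than being typed by hand.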

Configuration file

Set up ./config/config.yaml

For example see the test configuration file here

4. Analyse your data

Run/work through the master RMarkdown file (master.Rmd). This will do the bulk of the analyses and generate several html files for data visualisation and csv files with processed data

5. Commit and push to your forked version of the github repo

Push all the results files you're comfortable with being online.

To maintain reproducibility of your analysis, also commit and push:

All configuration files
All run scripts
All your documentation/notes

6. Repeat step 5 each time you re-run the analysis with different parameters

7. Create a github page (optional)

8. Contribute back!

Raise issues in the issues page
Create feature requests in the issues page
Contribute your code! Create your own branch from the development branch and create a pull request to the development branch once the code is on point!

Contributions and feedback are always welcome! 😊
