CRISPRviewR: an R package for repairing, comparing, and visualizing CRISPRs across environmental datasets


Background

CRISPRviewR uses the output from minCED (ctSkennerton/minced on GitHub) to associate, compare, and visualize CRISPR arrays across environmental samples. To get a sense of the shape of minCED data, see the example files.

This package relies on functions from other packages for data cleaning and plotting.

Installation

Future versions will be available on CRAN or Bioconductor. For now, you can install the development version from GitHub:

devtools::install_github("acvill/CRISPRviewR")

RStudio users may have to run RStudio with administrator privileges for the devtools installation to work.

Example

Please see the CRISPRviewR vignette for a suggested workflow.

Caveat emptor

The CRISPRviewR functions make no assumptions about the completeness of the CRISPR arrays annotated by minCED or the structure of the underlying assembly. In that regard, users of CRISPRviewR should be aware of the following possibilities.

  1. Due to their abundance of direct repeats, CRISPR arrays are often misassembled, particularly in the absence of sufficient coverage.
  2. minCED does not predict orientation of arrays, which requires either identification of cas genes or the annotation of a leader sequence. Therefore, CRISPRviewR plots may be backwards with respect to the direction of transcription. If this is a problem, use a tool like CRISPRleader to get strand orientation, then export your plots and invert manually.
  3. For fragmented assemblies, CRISPR arrays may occur at contig boundaries.
  4. For time-course metagenomic assemblies, differences in CRISPR array structure through time may reflect standing variation rather than array expansion or recombination.
  5. minCED relies on the CRISPR Recognition Tool (CRT) to detect CRISPR repeats. The CRT algorithm requires repeats to be identical, and this stringency can cause portions of repeat sequences to be misassigned to spacers. Consider setting fix_repeats = TRUE when calling read_minced() to address this issue. See "Fix truncated repeats" in the vignette and my related blog post for more details.
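
To make the truncated-repeat issue in item 5 concrete, here is a minimal base-R sketch of the symptom and one naive way to repair it. It is not the code behind fix_repeats = TRUE, and the helper names (shared_prefix_length, fix_truncated_repeat), the 75% threshold, and the toy sequences are all assumptions made for illustration. The sketch assumes the repeat was cut short at its 3' end, so the missing bases show up at the start of most spacers.

```r
# Hypothetical helper (not part of CRISPRviewR): count how many leading
# spacer positions carry the same base in at least `min_frac` of spacers.
shared_prefix_length <- function(spacers, min_frac = 0.75) {
  max_len <- min(nchar(spacers))
  n <- 0
  for (i in seq_len(max_len)) {
    bases <- substr(spacers, i, i)               # base i of every spacer
    top_frac <- max(table(bases)) / length(bases)
    if (top_frac < min_frac) break               # consensus ends here
    n <- i
  }
  n
}

# Hypothetical helper: move the shared prefix from the spacers back onto
# the 3' end of the repeat.
fix_truncated_repeat <- function(repeat_seq, spacers, min_frac = 0.75) {
  n <- shared_prefix_length(spacers, min_frac)
  if (n == 0) {
    return(list(repeat_seq = repeat_seq, spacers = spacers))
  }
  prefixes <- substr(spacers, 1, n)
  consensus <- names(which.max(table(prefixes)))  # most common prefix
  list(
    repeat_seq = paste0(repeat_seq, consensus),
    spacers = substring(spacers, n + 1)
  )
}

# Toy data (invented): three of four spacers start with "AT", which
# suggests that "AT" belongs to the repeat and not to the spacers.
called_repeat <- "GTTTTAGAGCT"
spacers <- c("ATACGGTCAAGT", "ATTTGCACCGTA", "GTCCGATAGGCA", "ATGGCATTCAAG")

fixed <- fix_truncated_repeat(called_repeat, spacers)
```

With only a handful of spacers, a shared leading base can occur by chance, so a per-position threshold like this one is best treated as a prompt for manual inspection. For real data, prefer the package's own fix_repeats option described in the vignette.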

Bugs and notes

  • CRISPRviewR has only been tested with output from minCED v0.4.2.
  • If you find a bug or want to suggest a new feature, please open an issue or make a pull request.
