-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathscatools.Rmd
134 lines (98 loc) · 3.47 KB
/
scatools.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
---
title: "scatools"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{scatools}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction
We start by loading `scatools` and a filepath to an example `fragments.bed.gz` file containing fragments from 100 normal mammary cells.
Note this vignette is currently set up with dummy normal data and is purely intended to demonstrate package functionality. There are no CNV changes in the test data.
## Installation
You can install the development version of scatools from [GitHub](https://github.com/) with:
``` r
devtools::install_github("mjz1/scatools")
```
## Example
We start by loading `scatools`, as well as the filepath to an example `fragments.bed.gz` file containing scATAC fragments from 100 normal mammary cells.
```{r setup, eval=TRUE, message=FALSE, warning=FALSE}
library(scatools)
library(dittoSeq)
library(ComplexHeatmap)
library(patchwork)
ncores <- 4 # Adjust accordingly
fragment_file <- system.file("extdata", "fragments.bed.gz", package = "scatools")
```
Now we process this data using `scatools`. Helper functions help us to create `GenomicRanges` bins, and compute GC content for downstream usage. Here we demonstrate using 10Mb bins.
**Note**: To generate your own `bins` object, a helper function is provided. Please note that running this function requires installation of a `BSgenome` of your choice. Here we use hg38.
```{r, eval=FALSE}
# if (!require("BiocManager", quietly = TRUE))
# install.packages("BiocManager")
# BiocManager::install("BSgenome.Hsapiens.UCSC.hg38")
bins <- get_tiled_bins(
bs_genome = BSgenome.Hsapiens.UCSC.hg38,
tilewidth = 1e7,
respect_chr_arms = TRUE
)
```
```{r, include=FALSE}
# Sideload bins
bins <- get(data(bins_10mb))
rm(bins_10mb)
```
We load blacklist regions for our genome. These are used to remove fragments found in low mappability or high signal regions as defined [here](https://github.com/Boyle-Lab/Blacklist).
```{r, eval=TRUE}
# Get blacklist regions
blacklist <- get_blacklist(genome = "hg38")
```
Then we run the pipeline:
```{r, eval=FALSE}
sce <- run_scatools(
sample_id = "test_sample",
fragment_file = fragment_file,
blacklist = blacklist,
outdir = "./example/",
verbose = TRUE,
overwrite = TRUE,
ncores = ncores,
bins = bins
)
```
```{r, include=FALSE}
# Sideload the SCE object
sce <- get(data(test_sce))
rm(test_sce)
```
Plot the results using the `dittoSeq` package.
```{r, fig.width = 8, fig.height = 3}
p1 <- dittoDimPlot(sce, var = "clusters")
p2 <- dittoDimPlot(sce, var = "tumor_cell")
p1 + p2 & theme(aspect.ratio = 1)
```
We can visualize the results as a heatmap.
```{r cnaheatmap_plot, message=FALSE, fig.width = 10, fig.height = 5}
sce <- sce[, order(sce$tumor_cell, sce$clusters)]
col_clones <- dittoColors()
col_clones <- col_clones[1:length(unique(sce[["clusters"]]))]
names(col_clones) <- levels(factor(sce[["clusters"]]))
left_annot <- ComplexHeatmap::HeatmapAnnotation(
tumor_cell = sce[["tumor_cell"]],
cnv_cluster = sce[["clusters"]],
col = list(cnv_cluster = col_clones),
which = "row"
)
p_ht_cna <- cnaHeatmap(sce, assay_name = "segment_merged_logratios", clust_annot = left_annot, col_fun = logr_col_fun())
p_ht_cna
```
Or plot individual cells
```{r cell_cna_plot, fig.width = 9, fig.height = 6, warning=FALSE}
# Plot the first five cells
plot_cell_cna(sce = sce, cell_id = 1:5, assay_name = "logr_modal")
```