forked from satijalab/seurat
-
Notifications
You must be signed in to change notification settings - Fork 0
/
seurat5_merge_vignette.Rmd
109 lines (88 loc) · 3.72 KB
/
seurat5_merge_vignette.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
---
title: "Seurat - Combining Two 10X Runs"
output:
html_document:
theme: united
df_print: kable
pdf_document: default
date: 'Compiled: `r Sys.Date()`'
---
```{r setup, include=TRUE}
all_times <- list() # store the time for each chunk
knitr::knit_hooks$set(time_it = local({
now <- NULL
function(before, options) {
if (before) {
now <<- Sys.time()
} else {
res <- difftime(Sys.time(), now, units = "secs")
all_times[[options$label]] <<- res
}
}
}))
knitr::opts_chunk$set(
tidy = TRUE,
tidy.opts = list(width.cutoff = 95),
message = FALSE,
warning = FALSE,
time_it = TRUE,
error = TRUE
)
```
```{r, include=TRUE}
options(SeuratData.repo.use = "http://satijalab04.nygenome.org")
```
In this vignette, we will combine two 10X PBMC datasets: one containing 4K cells and one containing 8K cells. The datasets can be found [here](https://support.10xgenomics.com/single-cell-gene-expression/datasets).
To start, we read in the data and create two `Seurat` objects.
```{r load_data}
library(Seurat)
pbmc4k.data <- Read10X(data.dir = "../data/pbmc4k/filtered_gene_bc_matrices/GRCh38/")
pbmc4k <- CreateSeuratObject(counts = pbmc4k.data, project = "PBMC4K")
pbmc4k
pbmc8k.data <- Read10X(data.dir = "../data/pbmc8k/filtered_gene_bc_matrices/GRCh38/")
pbmc8k <- CreateSeuratObject(counts = pbmc8k.data, project = "PBMC8K")
pbmc8k
```
# Merging Two `Seurat` Objects
`merge()` merges the raw count matrices of two `Seurat` objects and creates a new `Seurat` object with the resulting combined raw count matrix. To easily tell which original object any particular cell came from, you can set the `add.cell.ids` parameter with an `c(x, y)` vector, which will prepend the given identifier to the beginning of each cell name. The original project ID will remain stored in object meta data under `orig.ident`
```{r merge.objects}
pbmc.combined <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c('4K', '8K'), project = 'PBMC12K')
pbmc.combined
```
```{r inspect.merge}
# notice the cell names now have an added identifier
head(colnames(pbmc.combined))
table(pbmc.combined$orig.ident)
```
# Merging More Than Two `Seurat` Objects
To merge more than two `Seurat` objects, simply pass a vector of multiple `Seurat` objects to the `y` parameter for `merge`; we'll demonstrate this using the 4K and 8K PBMC datasets as well as our previously computed Seurat object from the 2,700 PBMC tutorial (loaded via the [SeuratData](https://github.com/satijalab/seurat-data) package).
```{r merge_three}
library(SeuratData)
InstallData("pbmc3k")
pbmc3k <- LoadData("pbmc3k", type = "pbmc3k.final")
pbmc3k
pbmc.big <- merge(pbmc3k, y = c(pbmc4k, pbmc8k), add.cell.ids = c('3K', '4K', '8K'), project = 'PBMC15K')
pbmc.big
head(colnames(pbmc.big))
tail(colnames(pbmc.big))
unique(sapply(X = strsplit(colnames(pbmc.big), split = '_'), FUN = '[', 1))
table(pbmc.big$orig.ident)
```
# Merge Based on Normalized Data
By default, `merge()` will combine the `Seurat` objects based on the raw count matrices, erasing any previously normalized and scaled data matrices. If you want to merge the normalized data matrices as well as the raw count matrices, simply pass `merge.data = TRUE`. This should be done if the same normalization approach was applied to all objects.
```{r normalize}
pbmc4k <- NormalizeData(pbmc4k)
pbmc8k <- NormalizeData(pbmc8k)
pbmc.normalized <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c('4K', '8K'), project = 'PBMC12K', merge.data = TRUE)
GetAssayData(pbmc.combined)[1:10, 1:15]
GetAssayData(pbmc.normalized)[1:10, 1:15]
```
```{r save.times, include=TRUE}
write.csv(x = t(as.data.frame(all_times)), file = "../output/timings/seurat5_merge_vignette_times.csv")
```
<details>
<summary>**Session Info**</summary>
```{r}
sessionInfo()
```
</details>