---
title: "Tutorial for scanning sample-specific miRNA regulation from bulk and single-cell RNA-sequencing data"
author: "\\
Junpeng Zhang ([email protected])\\
School of Engineering, Dali University"
date: '`r Sys.Date()`'
output:
  BiocStyle::html_document:
    toc: yes
  BiocStyle::pdf_document:
    toc: yes
vignette: >
  %\VignetteIndexEntry{Tutorial for scanning sample-specific miRNA regulation from bulk and single-cell RNA-sequencing data}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
  %\VignetteEncoding{UTF-8}
---
```{r style, echo=FALSE, results="asis", message=FALSE}
BiocStyle::markdown()
knitr::opts_chunk$set(tidy = FALSE,
warning = FALSE,
message = FALSE)
```
# Introduction
To model the dynamic regulatory processes of miRNAs at the single-sample level, we implement a Sample-specific miRNA regulation (Scan) framework that scans sample-specific miRNA regulation from bulk and single-cell RNA-sequencing data. In this tutorial, we show how to apply Scan to new data.
# Load required R packages
Scan provides 27 network inference methods for constructing the miRNA-mRNA relation matrix: Pearson [1], Spearman [2], Kendall [3], Distance correlation (Dcor) [4], Randomized Dependence Coefficient (RDC) [5], Hoeffding's D statistic (Hoeffding) [6], Z-score [7], Biweight midcorrelation (Bcor) [8], Weighted rank correlation (Wcor) [9], Cosine [10], Euclidean [11], Manhattan [12], Canberra [13], Chebyshev [14], Dice [15], Jaccard [16], Mahalanobis [17], Mutual Information (MI) [18], Maximal Information Coefficient (MIC) [19], Lasso [20], Elastic [20], Ridge [20], GenMiR++ [21], Phit [22], Phis [22], Rhop [22], and Intervention calculus when the Directed acyclic graph is Absent (IDA) [23]. Except for GenMiR++, which is implemented in Matlab, the other 26 network inference methods are implemented in R. Before applying Scan to new data, users need to install and load the following 16 R packages.
```{r, eval=TRUE, include=TRUE}
# Load required R packages
library(pracma)
library(WGCNA)
library(igraph)
library(energy)
library(Hmisc)
library(parmigene)
library(minerva)
library(glmnet)
library(pcalg)
library(doParallel)
library(philentropy)
library(StatMatch)
# The propr R package can be obtained from https://github.com/tpq/propr
library(propr)
library(gtools)
library(pbapply)
library(pcaPP)
```
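Because `library()` stops at the first package that is not installed, it can help to check availability up front. The following is a minimal sketch and not part of Scan itself; it assumes only the 16 package names from the chunk above and uses base R's `requireNamespace()`.

```r
# The 16 packages loaded in the chunk above.
required_pkgs <- c("pracma", "WGCNA", "igraph", "energy", "Hmisc", "parmigene",
                   "minerva", "glmnet", "pcalg", "doParallel", "philentropy",
                   "StatMatch", "propr", "gtools", "pbapply", "pcaPP")

# TRUE for every package whose namespace can be loaded.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report the packages that still need to be installed, if any.
missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Any package reported as missing has to be installed from its own repository before the tutorial chunks can run.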
# Data preparation
To scan sample-specific miRNA regulation, users should prepare matched miRNA and mRNA expression data and, optionally, putative miRNA-target interactions. In this tutorial, we use K562 single-cell RNA-sequencing data (the expression data of 2,822 miRNAs and 21,704 mRNAs in 19 half K562 cells) as an example. Putative miRNA-target interactions are from TargetScan v8.0 [24] and ENCORI [25] (formerly starBase). TargetScan provides a list of 235,109 predicted miRNA-mRNA interactions, and ENCORI provides a list of 55,343 high-confidence miRNA-mRNA interactions.
```{r, eval=TRUE, include=TRUE}
# Load K562 dataset
load("Data/K562_19_single-cell_matched_miR_mR.RData")
## Preprocess the single-cell sequencing data: apply log2(x+1), average the expression values of duplicate genes, remove genes with zero expression in all cells, and keep genes with above-median mean expression
# Transformation using log2(x+1)
miRNA_scRNA_norm <- log2(miRNA_scRNA_raw+1)
mRNA_scRNA_norm <- log2(mRNA_scRNA_raw+1)
# Compute the average expression values of duplicate genes
source("R/Scan.interp.R")
miRNA_scRNA_norm_average <- Averg_Duplicate(miRNA_scRNA_norm)
mRNA_scRNA_norm_average <- Averg_Duplicate(mRNA_scRNA_norm)
# Remove genes with zero expression values in all cells
miRNA_scRNA_norm_mean <- unlist(lapply(seq(dim(miRNA_scRNA_norm_average)[2]), function(i) mean(miRNA_scRNA_norm_average[, i])))
miRNA_scRNA_norm_zero <- miRNA_scRNA_norm_average[, which(miRNA_scRNA_norm_mean > 0)]
mRNA_scRNA_norm_mean <- unlist(lapply(seq(dim(mRNA_scRNA_norm_average)[2]), function(i) mean(mRNA_scRNA_norm_average[, i])))
mRNA_scRNA_norm_zero <- mRNA_scRNA_norm_average[, which(mRNA_scRNA_norm_mean > 0)]
# Retain genes whose mean expression across all cells is above the median
miRNA_scRNA_norm_mean_update <- unlist(lapply(seq(dim(miRNA_scRNA_norm_zero)[2]), function(i) mean(miRNA_scRNA_norm_zero[, i])))
miRNA_scRNA_norm_filter <- miRNA_scRNA_norm_zero[, which(miRNA_scRNA_norm_mean_update > median(miRNA_scRNA_norm_mean_update))]
mRNA_scRNA_norm_mean_update <- unlist(lapply(seq(dim(mRNA_scRNA_norm_zero)[2]), function(i) mean(mRNA_scRNA_norm_zero[, i])))
mRNA_scRNA_norm_filter <- mRNA_scRNA_norm_zero[, which(mRNA_scRNA_norm_mean_update > median(mRNA_scRNA_norm_mean_update))]
# Load prior information
ENCORI <- read.csv("Data/ENCORI.csv", header = TRUE, sep = ",")
TargetScan <- read.csv("Data/TargetScan_8.0.csv", header = TRUE, sep = ",")
ENCORI_graph <- make_graph(c(t(ENCORI)), directed = FALSE)
TargetScan_graph <- make_graph(c(t(TargetScan)), directed = FALSE)
```
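The filtering steps above can be traced on a toy matrix. This is a sketch under stated assumptions: the matrix `raw_counts` and its gene names are made up for illustration, cells are in rows and genes in columns (as the column indexing in the chunk above implies), and `colMeans()` stands in for the per-column `mean()` calls. Averaging of duplicate genes (`Averg_Duplicate`) is not reproduced here.

```r
# Hypothetical raw counts: 3 cells (rows) x 4 genes (columns).
raw_counts <- matrix(c(0, 0, 0,      # geneA: never expressed
                       1, 3, 7,      # geneB
                       15, 31, 63,   # geneC
                       0, 1, 0),     # geneD: barely expressed
                     nrow = 3,
                     dimnames = list(paste0("cell", 1:3),
                                     c("geneA", "geneB", "geneC", "geneD")))

# Step 1: log2(x + 1) transformation.
norm_expr <- log2(raw_counts + 1)

# Step 2: drop genes whose mean expression is zero across all cells.
gene_means <- colMeans(norm_expr)
expressed <- norm_expr[, gene_means > 0, drop = FALSE]

# Step 3: keep genes whose mean expression is strictly above the median.
expressed_means <- colMeans(expressed)
filtered <- expressed[, expressed_means > median(expressed_means), drop = FALSE]

colnames(expressed)  # "geneB" "geneC" "geneD"
colnames(filtered)   # "geneC"
```

Because the comparison with the median is strict, roughly half of the expressed genes are discarded at the last step, which is worth keeping in mind for datasets with few features.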
For convenience, the full list of our prepared bulk and single-cell transcriptomics datasets in the Scan paper can be obtained from [here](https://drive.google.com/file/d/1MgLNYcALNi4nR4S9MiYTGyUCekbGwM_k/view?usp=drive_link).
# Predicting sample-specific miRNA-mRNA regulatory networks
The utility functions for scanning sample-specific miRNA regulation are collected in two source files: **Scan.interp.R** (using a linear interpolation strategy) and **Scan.perturb.R** (using a statistical perturbation strategy). For a large-scale dataset (e.g. more than 100 samples), we recommend that users select network inference methods with better efficiency and scalability (i.e. shorter runtime). For example, in our work, with the linear interpolation strategy (Scan.interp), 7 of the 27 network inference methods (Pearson, Z-score, Bcor, Wcor, Phit, Phis and Rhop) run in less than an hour on both the K562 and BRCA datasets, indicating good efficiency and scalability. With the statistical perturbation strategy (Scan.perturb), 11 of the 27 network inference methods (Pearson, Z-score, Bcor, Wcor, Euclidean, Manhattan, Canberra, Chebyshev, Phit, Phis and Rhop) run in less than an hour on both datasets, which likewise indicates good efficiency and scalability.
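To compare runtimes on your own data, a small timing wrapper can be convenient. The sketch below is illustrative only: `time_call` is a hypothetical helper (not a Scan function), and base R's `cor()` on random data stands in for a network inference call.

```r
# Hypothetical helper: run fun(...) and record the elapsed wall-clock seconds.
time_call <- function(fun, ...) {
  start <- Sys.time()
  result <- fun(...)
  elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
  list(result = result, seconds = elapsed)
}

# Stand-in workload: Pearson correlation on a toy 10 x 4 matrix.
set.seed(1)
toy_expr <- matrix(rnorm(40), nrow = 10)
timed <- time_call(cor, toy_expr, method = "pearson")

dim(timed$result)  # 4 4
timed$seconds      # elapsed time in seconds
```

The same wrapper could, in principle, be applied to any inference function with the same calling pattern, so that runtimes are collected in one place rather than in separate start and end variables.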
In this tutorial, we select five representative network inference methods (Pearson, Euclidean, MI, Lasso and Phit) spanning five types (Correlation, Distance, Information, Regression and Proportionality) and two strategies (Scan.interp and Scan.perturb) to infer cell-specific miRNA regulation from the small-scale K562 single-cell RNA-sequencing data. We consider three cases of prior information on miRNA targets: None (no prior information), TargetScan (prior information from TargetScan) and ENCORI (prior information from ENCORI).
```{r, eval=TRUE, include=TRUE}
# No prior information with Scan.interp
source("R/Scan.interp.R")
Scan.interp_Pearson_timestart <- Sys.time()
Scan.interp_Pearson_NULL_res <- Scan.interp(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Pearson")
Scan.interp_Pearson_timeend <- Sys.time()
Scan.interp_Pearson_runningtime_NULL <- as.numeric(difftime(Scan.interp_Pearson_timeend, Scan.interp_Pearson_timestart, units = "secs"))
Scan.interp_Euclidean_timestart <- Sys.time()
Scan.interp_Euclidean_NULL_res <- Scan.interp(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Euclidean")
Scan.interp_Euclidean_timeend <- Sys.time()
Scan.interp_Euclidean_runningtime_NULL <- as.numeric(difftime(Scan.interp_Euclidean_timeend, Scan.interp_Euclidean_timestart, units = "secs"))
Scan.interp_MI_timestart <- Sys.time()
Scan.interp_MI_NULL_res <- Scan.interp(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "MI")
Scan.interp_MI_timeend <- Sys.time()
Scan.interp_MI_runningtime_NULL <- as.numeric(difftime(Scan.interp_MI_timeend, Scan.interp_MI_timestart, units = "secs"))
Scan.interp_Lasso_timestart <- Sys.time()
Scan.interp_Lasso_NULL_res <- Scan.interp(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Lasso")
Scan.interp_Lasso_timeend <- Sys.time()
Scan.interp_Lasso_runningtime_NULL <- as.numeric(difftime(Scan.interp_Lasso_timeend, Scan.interp_Lasso_timestart, units = "secs"))
Scan.interp_Phit_timestart <- Sys.time()
Scan.interp_Phit_NULL_res <- Scan.interp(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Phit")
Scan.interp_Phit_timeend <- Sys.time()
Scan.interp_Phit_runningtime_NULL <- as.numeric(difftime(Scan.interp_Phit_timeend, Scan.interp_Phit_timestart, units = "secs"))
# Prior information of TargetScan with Scan.interp
Scan.interp_Pearson_TargetScan_res <- lapply(seq(Scan.interp_Pearson_NULL_res), function(i) Scan.interp_Pearson_NULL_res[[i]] %s% TargetScan_graph)
Scan.interp_Euclidean_TargetScan_res <- lapply(seq(Scan.interp_Euclidean_NULL_res), function(i) Scan.interp_Euclidean_NULL_res[[i]] %s% TargetScan_graph)
Scan.interp_MI_TargetScan_res <- lapply(seq(Scan.interp_MI_NULL_res), function(i) Scan.interp_MI_NULL_res[[i]] %s% TargetScan_graph)
Scan.interp_Lasso_TargetScan_res <- lapply(seq(Scan.interp_Lasso_NULL_res), function(i) Scan.interp_Lasso_NULL_res[[i]] %s% TargetScan_graph)
Scan.interp_Phit_TargetScan_res <- lapply(seq(Scan.interp_Phit_NULL_res), function(i) Scan.interp_Phit_NULL_res[[i]] %s% TargetScan_graph)
# Prior information of ENCORI with Scan.interp
Scan.interp_Pearson_ENCORI_res <- lapply(seq(Scan.interp_Pearson_NULL_res), function(i) Scan.interp_Pearson_NULL_res[[i]] %s% ENCORI_graph)
Scan.interp_Euclidean_ENCORI_res <- lapply(seq(Scan.interp_Euclidean_NULL_res), function(i) Scan.interp_Euclidean_NULL_res[[i]] %s% ENCORI_graph)
Scan.interp_MI_ENCORI_res <- lapply(seq(Scan.interp_MI_NULL_res), function(i) Scan.interp_MI_NULL_res[[i]] %s% ENCORI_graph)
Scan.interp_Lasso_ENCORI_res <- lapply(seq(Scan.interp_Lasso_NULL_res), function(i) Scan.interp_Lasso_NULL_res[[i]] %s% ENCORI_graph)
Scan.interp_Phit_ENCORI_res <- lapply(seq(Scan.interp_Phit_NULL_res), function(i) Scan.interp_Phit_NULL_res[[i]] %s% ENCORI_graph)
# No prior information with Scan.perturb
source("R/Scan.perturb.R")
Scan.perturb_Pearson_timestart <- Sys.time()
Scan.perturb_Pearson_NULL_res <- Scan.perturb(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Pearson")
Scan.perturb_Pearson_timeend <- Sys.time()
Scan.perturb_Pearson_runningtime_NULL <- as.numeric(difftime(Scan.perturb_Pearson_timeend, Scan.perturb_Pearson_timestart, units = "secs"))
Scan.perturb_Euclidean_timestart <- Sys.time()
Scan.perturb_Euclidean_NULL_res <- Scan.perturb(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Euclidean")
Scan.perturb_Euclidean_timeend <- Sys.time()
Scan.perturb_Euclidean_runningtime_NULL <- as.numeric(difftime(Scan.perturb_Euclidean_timeend, Scan.perturb_Euclidean_timestart, units = "secs"))
Scan.perturb_MI_timestart <- Sys.time()
Scan.perturb_MI_NULL_res <- Scan.perturb(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "MI")
Scan.perturb_MI_timeend <- Sys.time()
Scan.perturb_MI_runningtime_NULL <- as.numeric(difftime(Scan.perturb_MI_timeend, Scan.perturb_MI_timestart, units = "secs"))
Scan.perturb_Lasso_timestart <- Sys.time()
Scan.perturb_Lasso_NULL_res <- Scan.perturb(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Lasso")
Scan.perturb_Lasso_timeend <- Sys.time()
Scan.perturb_Lasso_runningtime_NULL <- as.numeric(difftime(Scan.perturb_Lasso_timeend, Scan.perturb_Lasso_timestart, units = "secs"))
Scan.perturb_Phit_timestart <- Sys.time()
Scan.perturb_Phit_NULL_res <- Scan.perturb(miRNA_scRNA_norm_filter, mRNA_scRNA_norm_filter, method = "Phit")
Scan.perturb_Phit_timeend <- Sys.time()
Scan.perturb_Phit_runningtime_NULL <- as.numeric(difftime(Scan.perturb_Phit_timeend, Scan.perturb_Phit_timestart, units = "secs"))
# Prior information of TargetScan with Scan.perturb
Scan.perturb_Pearson_TargetScan_res <- lapply(seq(Scan.perturb_Pearson_NULL_res), function(i) Scan.perturb_Pearson_NULL_res[[i]] %s% TargetScan_graph)
Scan.perturb_Euclidean_TargetScan_res <- lapply(seq(Scan.perturb_Euclidean_NULL_res), function(i) Scan.perturb_Euclidean_NULL_res[[i]] %s% TargetScan_graph)
Scan.perturb_MI_TargetScan_res <- lapply(seq(Scan.perturb_MI_NULL_res), function(i) Scan.perturb_MI_NULL_res[[i]] %s% TargetScan_graph)
Scan.perturb_Lasso_TargetScan_res <- lapply(seq(Scan.perturb_Lasso_NULL_res), function(i) Scan.perturb_Lasso_NULL_res[[i]] %s% TargetScan_graph)
Scan.perturb_Phit_TargetScan_res <- lapply(seq(Scan.perturb_Phit_NULL_res), function(i) Scan.perturb_Phit_NULL_res[[i]] %s% TargetScan_graph)
# Prior information of ENCORI with Scan.perturb
Scan.perturb_Pearson_ENCORI_res <- lapply(seq(Scan.perturb_Pearson_NULL_res), function(i) Scan.perturb_Pearson_NULL_res[[i]] %s% ENCORI_graph)
Scan.perturb_Euclidean_ENCORI_res <- lapply(seq(Scan.perturb_Euclidean_NULL_res), function(i) Scan.perturb_Euclidean_NULL_res[[i]] %s% ENCORI_graph)
Scan.perturb_MI_ENCORI_res <- lapply(seq(Scan.perturb_MI_NULL_res), function(i) Scan.perturb_MI_NULL_res[[i]] %s% ENCORI_graph)
Scan.perturb_Lasso_ENCORI_res <- lapply(seq(Scan.perturb_Lasso_NULL_res), function(i) Scan.perturb_Lasso_NULL_res[[i]] %s% ENCORI_graph)
Scan.perturb_Phit_ENCORI_res <- lapply(seq(Scan.perturb_Phit_NULL_res), function(i) Scan.perturb_Phit_NULL_res[[i]] %s% ENCORI_graph)
```
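In the chunk above, prior information is applied with igraph's `%s%` operator, which intersects two graphs (by vertex name when the graphs are named). The sketch below shows the effect on toy data; the miRNA and gene names and the helper `apply_prior` are made up for illustration and are not part of Scan.

```r
library(igraph)

# Toy sample-specific networks: one undirected graph per sample.
sample_networks <- list(
  make_graph(c("miR-1", "geneA", "miR-1", "geneB", "miR-2", "geneC"), directed = FALSE),
  make_graph(c("miR-2", "geneC", "miR-3", "geneD"), directed = FALSE)
)

# Toy prior knowledge of putative miRNA-target interactions.
prior_graph <- make_graph(c("miR-1", "geneA", "miR-2", "geneC"), directed = FALSE)

# Hypothetical helper: keep only predicted edges that also occur in the prior graph.
apply_prior <- function(networks, prior) {
  lapply(networks, function(g) g %s% prior)
}

filtered_networks <- apply_prior(sample_networks, prior_graph)

# Number of edges that survive the prior filter in each sample.
edges_kept <- vapply(filtered_networks, ecount, numeric(1))
edges_kept
```

In this toy case the first sample keeps two of its three edges and the second keeps one of two. A helper of this kind could replace the repeated `lapply()` calls, at the cost of departing from the variable naming used throughout this tutorial.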
# Accuracy comparison
For accuracy comparison, the ground truth of miRNA-mRNA interactions is acquired from miRTarBase v9.0 [26] and TarBase v8.0 [27] and is used for validation at the multi-sample level. We have 10 combinations (5 network inference methods and 2 strategies). A combination with a larger average percentage of validated miRNA-mRNA interactions has higher accuracy.
```{r, eval=TRUE, include=TRUE}
# Number of predicted sample-specific interactions using Scan.interp
Scan.interp_Pearson_NULL_res_num <- unlist(lapply(seq(Scan.interp_Pearson_NULL_res), function(i) nrow(as_data_frame(Scan.interp_Pearson_NULL_res[[i]] ))))
Scan.interp_Pearson_TargetScan_res_num <- unlist(lapply(seq(Scan.interp_Pearson_TargetScan_res), function(i) nrow(as_data_frame(Scan.interp_Pearson_TargetScan_res[[i]] ))))
Scan.interp_Pearson_ENCORI_res_num <- unlist(lapply(seq(Scan.interp_Pearson_ENCORI_res), function(i) nrow(as_data_frame(Scan.interp_Pearson_ENCORI_res[[i]] ))))
Scan.interp_Euclidean_NULL_res_num <- unlist(lapply(seq(Scan.interp_Euclidean_NULL_res), function(i) nrow(as_data_frame(Scan.interp_Euclidean_NULL_res[[i]] ))))
Scan.interp_Euclidean_TargetScan_res_num <- unlist(lapply(seq(Scan.interp_Euclidean_TargetScan_res), function(i) nrow(as_data_frame(Scan.interp_Euclidean_TargetScan_res[[i]] ))))
Scan.interp_Euclidean_ENCORI_res_num <- unlist(lapply(seq(Scan.interp_Euclidean_ENCORI_res), function(i) nrow(as_data_frame(Scan.interp_Euclidean_ENCORI_res[[i]] ))))
Scan.interp_MI_NULL_res_num <- unlist(lapply(seq(Scan.interp_MI_NULL_res), function(i) nrow(as_data_frame(Scan.interp_MI_NULL_res[[i]] ))))
Scan.interp_MI_TargetScan_res_num <- unlist(lapply(seq(Scan.interp_MI_TargetScan_res), function(i) nrow(as_data_frame(Scan.interp_MI_TargetScan_res[[i]] ))))
Scan.interp_MI_ENCORI_res_num <- unlist(lapply(seq(Scan.interp_MI_ENCORI_res), function(i) nrow(as_data_frame(Scan.interp_MI_ENCORI_res[[i]] ))))
Scan.interp_Lasso_NULL_res_num <- unlist(lapply(seq(Scan.interp_Lasso_NULL_res), function(i) nrow(as_data_frame(Scan.interp_Lasso_NULL_res[[i]] ))))
Scan.interp_Lasso_TargetScan_res_num <- unlist(lapply(seq(Scan.interp_Lasso_TargetScan_res), function(i) nrow(as_data_frame(Scan.interp_Lasso_TargetScan_res[[i]] ))))
Scan.interp_Lasso_ENCORI_res_num <- unlist(lapply(seq(Scan.interp_Lasso_ENCORI_res), function(i) nrow(as_data_frame(Scan.interp_Lasso_ENCORI_res[[i]] ))))
Scan.interp_Phit_NULL_res_num <- unlist(lapply(seq(Scan.interp_Phit_NULL_res), function(i) nrow(as_data_frame(Scan.interp_Phit_NULL_res[[i]] ))))
Scan.interp_Phit_TargetScan_res_num <- unlist(lapply(seq(Scan.interp_Phit_TargetScan_res), function(i) nrow(as_data_frame(Scan.interp_Phit_TargetScan_res[[i]] ))))
Scan.interp_Phit_ENCORI_res_num <- unlist(lapply(seq(Scan.interp_Phit_ENCORI_res), function(i) nrow(as_data_frame(Scan.interp_Phit_ENCORI_res[[i]] ))))
# Experimentally validated sample-specific miRNA-mRNA interactions using Scan.interp
miRTarget_groundtruth <- as.matrix(read.csv("Data/miRTarBase_v9.0+TarBase_v8.0.csv", header = TRUE, sep=","))
miRTarget_groundtruth_graph <- make_graph(c(t(miRTarget_groundtruth[, 1:2])), directed = FALSE)
Scan.interp_Pearson_NULL_res_validated <- lapply(seq(Scan.interp_Pearson_NULL_res), function(i) as_data_frame(Scan.interp_Pearson_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Pearson_TargetScan_res_validated <- lapply(seq(Scan.interp_Pearson_TargetScan_res), function(i) as_data_frame(Scan.interp_Pearson_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Pearson_ENCORI_res_validated <- lapply(seq(Scan.interp_Pearson_ENCORI_res), function(i) as_data_frame(Scan.interp_Pearson_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Euclidean_NULL_res_validated <- lapply(seq(Scan.interp_Euclidean_NULL_res), function(i) as_data_frame(Scan.interp_Euclidean_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Euclidean_TargetScan_res_validated <- lapply(seq(Scan.interp_Euclidean_TargetScan_res), function(i) as_data_frame(Scan.interp_Euclidean_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Euclidean_ENCORI_res_validated <- lapply(seq(Scan.interp_Euclidean_ENCORI_res), function(i) as_data_frame(Scan.interp_Euclidean_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_MI_NULL_res_validated <- lapply(seq(Scan.interp_MI_NULL_res), function(i) as_data_frame(Scan.interp_MI_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_MI_TargetScan_res_validated <- lapply(seq(Scan.interp_MI_TargetScan_res), function(i) as_data_frame(Scan.interp_MI_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_MI_ENCORI_res_validated <- lapply(seq(Scan.interp_MI_ENCORI_res), function(i) as_data_frame(Scan.interp_MI_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Lasso_NULL_res_validated <- lapply(seq(Scan.interp_Lasso_NULL_res), function(i) as_data_frame(Scan.interp_Lasso_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Lasso_TargetScan_res_validated <- lapply(seq(Scan.interp_Lasso_TargetScan_res), function(i) as_data_frame(Scan.interp_Lasso_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Lasso_ENCORI_res_validated <- lapply(seq(Scan.interp_Lasso_ENCORI_res), function(i) as_data_frame(Scan.interp_Lasso_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Phit_NULL_res_validated <- lapply(seq(Scan.interp_Phit_NULL_res), function(i) as_data_frame(Scan.interp_Phit_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Phit_TargetScan_res_validated <- lapply(seq(Scan.interp_Phit_TargetScan_res), function(i) as_data_frame(Scan.interp_Phit_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.interp_Phit_ENCORI_res_validated <- lapply(seq(Scan.interp_Phit_ENCORI_res), function(i) as_data_frame(Scan.interp_Phit_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
## Percentage of experimentally validated sample-specific miRNA-mRNA interactions using Scan.interp
Scan.interp_Pearson_NULL_res_validated_per <- unlist(lapply(seq(Scan.interp_Pearson_NULL_res), function(i) 100*nrow(Scan.interp_Pearson_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Pearson_NULL_res[[i]]))))
Scan.interp_Pearson_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.interp_Pearson_TargetScan_res), function(i) 100*nrow(Scan.interp_Pearson_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Pearson_TargetScan_res[[i]]))))
Scan.interp_Pearson_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.interp_Pearson_ENCORI_res), function(i) 100*nrow(Scan.interp_Pearson_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Pearson_ENCORI_res[[i]]))))
Scan.interp_Euclidean_NULL_res_validated_per <- unlist(lapply(seq(Scan.interp_Euclidean_NULL_res), function(i) 100*nrow(Scan.interp_Euclidean_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Euclidean_NULL_res[[i]]))))
Scan.interp_Euclidean_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.interp_Euclidean_TargetScan_res), function(i) 100*nrow(Scan.interp_Euclidean_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Euclidean_TargetScan_res[[i]]))))
Scan.interp_Euclidean_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.interp_Euclidean_ENCORI_res), function(i) 100*nrow(Scan.interp_Euclidean_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Euclidean_ENCORI_res[[i]]))))
Scan.interp_MI_NULL_res_validated_per <- unlist(lapply(seq(Scan.interp_MI_NULL_res), function(i) 100*nrow(Scan.interp_MI_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.interp_MI_NULL_res[[i]]))))
Scan.interp_MI_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.interp_MI_TargetScan_res), function(i) 100*nrow(Scan.interp_MI_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.interp_MI_TargetScan_res[[i]]))))
Scan.interp_MI_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.interp_MI_ENCORI_res), function(i) 100*nrow(Scan.interp_MI_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.interp_MI_ENCORI_res[[i]]))))
Scan.interp_Lasso_NULL_res_validated_per <- unlist(lapply(seq(Scan.interp_Lasso_NULL_res), function(i) 100*nrow(Scan.interp_Lasso_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Lasso_NULL_res[[i]]))))
Scan.interp_Lasso_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.interp_Lasso_TargetScan_res), function(i) 100*nrow(Scan.interp_Lasso_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Lasso_TargetScan_res[[i]]))))
Scan.interp_Lasso_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.interp_Lasso_ENCORI_res), function(i) 100*nrow(Scan.interp_Lasso_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Lasso_ENCORI_res[[i]]))))
Scan.interp_Phit_NULL_res_validated_per <- unlist(lapply(seq(Scan.interp_Phit_NULL_res), function(i) 100*nrow(Scan.interp_Phit_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Phit_NULL_res[[i]]))))
Scan.interp_Phit_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.interp_Phit_TargetScan_res), function(i) 100*nrow(Scan.interp_Phit_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Phit_TargetScan_res[[i]]))))
Scan.interp_Phit_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.interp_Phit_ENCORI_res), function(i) 100*nrow(Scan.interp_Phit_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.interp_Phit_ENCORI_res[[i]]))))
# Number of predicted sample-specific interactions using Scan.perturb
Scan.perturb_Pearson_NULL_res_num <- unlist(lapply(seq(Scan.perturb_Pearson_NULL_res), function(i) nrow(as_data_frame(Scan.perturb_Pearson_NULL_res[[i]] ))))
Scan.perturb_Pearson_TargetScan_res_num <- unlist(lapply(seq(Scan.perturb_Pearson_TargetScan_res), function(i) nrow(as_data_frame(Scan.perturb_Pearson_TargetScan_res[[i]] ))))
Scan.perturb_Pearson_ENCORI_res_num <- unlist(lapply(seq(Scan.perturb_Pearson_ENCORI_res), function(i) nrow(as_data_frame(Scan.perturb_Pearson_ENCORI_res[[i]] ))))
Scan.perturb_Euclidean_NULL_res_num <- unlist(lapply(seq(Scan.perturb_Euclidean_NULL_res), function(i) nrow(as_data_frame(Scan.perturb_Euclidean_NULL_res[[i]] ))))
Scan.perturb_Euclidean_TargetScan_res_num <- unlist(lapply(seq(Scan.perturb_Euclidean_TargetScan_res), function(i) nrow(as_data_frame(Scan.perturb_Euclidean_TargetScan_res[[i]] ))))
Scan.perturb_Euclidean_ENCORI_res_num <- unlist(lapply(seq(Scan.perturb_Euclidean_ENCORI_res), function(i) nrow(as_data_frame(Scan.perturb_Euclidean_ENCORI_res[[i]] ))))
Scan.perturb_MI_NULL_res_num <- unlist(lapply(seq(Scan.perturb_MI_NULL_res), function(i) nrow(as_data_frame(Scan.perturb_MI_NULL_res[[i]] ))))
Scan.perturb_MI_TargetScan_res_num <- unlist(lapply(seq(Scan.perturb_MI_TargetScan_res), function(i) nrow(as_data_frame(Scan.perturb_MI_TargetScan_res[[i]] ))))
Scan.perturb_MI_ENCORI_res_num <- unlist(lapply(seq(Scan.perturb_MI_ENCORI_res), function(i) nrow(as_data_frame(Scan.perturb_MI_ENCORI_res[[i]] ))))
Scan.perturb_Lasso_NULL_res_num <- unlist(lapply(seq(Scan.perturb_Lasso_NULL_res), function(i) nrow(as_data_frame(Scan.perturb_Lasso_NULL_res[[i]] ))))
Scan.perturb_Lasso_TargetScan_res_num <- unlist(lapply(seq(Scan.perturb_Lasso_TargetScan_res), function(i) nrow(as_data_frame(Scan.perturb_Lasso_TargetScan_res[[i]] ))))
Scan.perturb_Lasso_ENCORI_res_num <- unlist(lapply(seq(Scan.perturb_Lasso_ENCORI_res), function(i) nrow(as_data_frame(Scan.perturb_Lasso_ENCORI_res[[i]] ))))
Scan.perturb_Phit_NULL_res_num <- unlist(lapply(seq(Scan.perturb_Phit_NULL_res), function(i) nrow(as_data_frame(Scan.perturb_Phit_NULL_res[[i]] ))))
Scan.perturb_Phit_TargetScan_res_num <- unlist(lapply(seq(Scan.perturb_Phit_TargetScan_res), function(i) nrow(as_data_frame(Scan.perturb_Phit_TargetScan_res[[i]] ))))
Scan.perturb_Phit_ENCORI_res_num <- unlist(lapply(seq(Scan.perturb_Phit_ENCORI_res), function(i) nrow(as_data_frame(Scan.perturb_Phit_ENCORI_res[[i]] ))))
# Experimentally validated sample-specific miRNA-mRNA interactions using Scan.perturb
miRTarget_groundtruth <- as.matrix(read.csv("Data/miRTarBase_v9.0+TarBase_v8.0.csv", header = TRUE, sep=","))
miRTarget_groundtruth_graph <- make_graph(c(t(miRTarget_groundtruth[, 1:2])), directed = FALSE)
Scan.perturb_Pearson_NULL_res_validated <- lapply(seq(Scan.perturb_Pearson_NULL_res), function(i) as_data_frame(Scan.perturb_Pearson_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Pearson_TargetScan_res_validated <- lapply(seq(Scan.perturb_Pearson_TargetScan_res), function(i) as_data_frame(Scan.perturb_Pearson_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Pearson_ENCORI_res_validated <- lapply(seq(Scan.perturb_Pearson_ENCORI_res), function(i) as_data_frame(Scan.perturb_Pearson_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Euclidean_NULL_res_validated <- lapply(seq(Scan.perturb_Euclidean_NULL_res), function(i) as_data_frame(Scan.perturb_Euclidean_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Euclidean_TargetScan_res_validated <- lapply(seq(Scan.perturb_Euclidean_TargetScan_res), function(i) as_data_frame(Scan.perturb_Euclidean_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Euclidean_ENCORI_res_validated <- lapply(seq(Scan.perturb_Euclidean_ENCORI_res), function(i) as_data_frame(Scan.perturb_Euclidean_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_MI_NULL_res_validated <- lapply(seq(Scan.perturb_MI_NULL_res), function(i) as_data_frame(Scan.perturb_MI_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_MI_TargetScan_res_validated <- lapply(seq(Scan.perturb_MI_TargetScan_res), function(i) as_data_frame(Scan.perturb_MI_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_MI_ENCORI_res_validated <- lapply(seq(Scan.perturb_MI_ENCORI_res), function(i) as_data_frame(Scan.perturb_MI_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Lasso_NULL_res_validated <- lapply(seq(Scan.perturb_Lasso_NULL_res), function(i) as_data_frame(Scan.perturb_Lasso_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Lasso_TargetScan_res_validated <- lapply(seq(Scan.perturb_Lasso_TargetScan_res), function(i) as_data_frame(Scan.perturb_Lasso_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Lasso_ENCORI_res_validated <- lapply(seq(Scan.perturb_Lasso_ENCORI_res), function(i) as_data_frame(Scan.perturb_Lasso_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Phit_NULL_res_validated <- lapply(seq(Scan.perturb_Phit_NULL_res), function(i) as_data_frame(Scan.perturb_Phit_NULL_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Phit_TargetScan_res_validated <- lapply(seq(Scan.perturb_Phit_TargetScan_res), function(i) as_data_frame(Scan.perturb_Phit_TargetScan_res[[i]] %s% miRTarget_groundtruth_graph))
Scan.perturb_Phit_ENCORI_res_validated <- lapply(seq(Scan.perturb_Phit_ENCORI_res), function(i) as_data_frame(Scan.perturb_Phit_ENCORI_res[[i]] %s% miRTarget_groundtruth_graph))
## Percentage of experimentally validated sample-specific miRNA-mRNA interactions using Scan.perturb
Scan.perturb_Pearson_NULL_res_validated_per <- unlist(lapply(seq(Scan.perturb_Pearson_NULL_res), function(i) 100*nrow(Scan.perturb_Pearson_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Pearson_NULL_res[[i]]))))
Scan.perturb_Pearson_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.perturb_Pearson_TargetScan_res), function(i) 100*nrow(Scan.perturb_Pearson_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Pearson_TargetScan_res[[i]]))))
Scan.perturb_Pearson_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.perturb_Pearson_ENCORI_res), function(i) 100*nrow(Scan.perturb_Pearson_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Pearson_ENCORI_res[[i]]))))
Scan.perturb_Euclidean_NULL_res_validated_per <- unlist(lapply(seq(Scan.perturb_Euclidean_NULL_res), function(i) 100*nrow(Scan.perturb_Euclidean_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Euclidean_NULL_res[[i]]))))
Scan.perturb_Euclidean_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.perturb_Euclidean_TargetScan_res), function(i) 100*nrow(Scan.perturb_Euclidean_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Euclidean_TargetScan_res[[i]]))))
Scan.perturb_Euclidean_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.perturb_Euclidean_ENCORI_res), function(i) 100*nrow(Scan.perturb_Euclidean_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Euclidean_ENCORI_res[[i]]))))
Scan.perturb_MI_NULL_res_validated_per <- unlist(lapply(seq(Scan.perturb_MI_NULL_res), function(i) 100*nrow(Scan.perturb_MI_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_MI_NULL_res[[i]]))))
Scan.perturb_MI_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.perturb_MI_TargetScan_res), function(i) 100*nrow(Scan.perturb_MI_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_MI_TargetScan_res[[i]]))))
Scan.perturb_MI_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.perturb_MI_ENCORI_res), function(i) 100*nrow(Scan.perturb_MI_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_MI_ENCORI_res[[i]]))))
Scan.perturb_Lasso_NULL_res_validated_per <- unlist(lapply(seq(Scan.perturb_Lasso_NULL_res), function(i) 100*nrow(Scan.perturb_Lasso_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Lasso_NULL_res[[i]]))))
Scan.perturb_Lasso_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.perturb_Lasso_TargetScan_res), function(i) 100*nrow(Scan.perturb_Lasso_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Lasso_TargetScan_res[[i]]))))
Scan.perturb_Lasso_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.perturb_Lasso_ENCORI_res), function(i) 100*nrow(Scan.perturb_Lasso_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Lasso_ENCORI_res[[i]]))))
Scan.perturb_Phit_NULL_res_validated_per <- unlist(lapply(seq(Scan.perturb_Phit_NULL_res), function(i) 100*nrow(Scan.perturb_Phit_NULL_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Phit_NULL_res[[i]]))))
Scan.perturb_Phit_TargetScan_res_validated_per <- unlist(lapply(seq(Scan.perturb_Phit_TargetScan_res), function(i) 100*nrow(Scan.perturb_Phit_TargetScan_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Phit_TargetScan_res[[i]]))))
Scan.perturb_Phit_ENCORI_res_validated_per <- unlist(lapply(seq(Scan.perturb_Phit_ENCORI_res), function(i) 100*nrow(Scan.perturb_Phit_ENCORI_res_validated[[i]])/nrow(as_data_frame(Scan.perturb_Phit_ENCORI_res[[i]]))))
```
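The lines above all apply the same formula: for each sample, 100 times the number of validated interactions divided by the number of predicted interactions. As a minimal sketch, that calculation can be written once as a helper. The function `validated_percentage()` and the toy edge lists are hypothetical and not part of Scan. The sketch also assumes each network is already stored as a data frame of edges, whereas the tutorial code first converts each predicted network with `as_data_frame()`.

```r
# Hypothetical helper (not part of Scan): per-sample percentage of predicted
# interactions that are experimentally validated.
validated_percentage <- function(predicted, validated) {
  stopifnot(length(predicted) == length(validated))
  vapply(seq_along(predicted), function(i) {
    n_pred <- nrow(predicted[[i]])
    # Avoid dividing by zero when a sample has no predicted interactions
    if (n_pred == 0) return(NA_real_)
    100 * nrow(validated[[i]]) / n_pred
  }, numeric(1))
}

# Toy input (made-up values): two samples, each an edge list in a data frame
toy_predicted <- list(
  data.frame(miRNA = c("miR-a", "miR-a", "miR-b", "miR-c"),
             target = c("G1", "G2", "G3", "G4")),
  data.frame(miRNA = c("miR-a", "miR-b"), target = c("G1", "G3"))
)
toy_validated <- list(
  data.frame(miRNA = "miR-a", target = "G1"),
  data.frame(miRNA = c("miR-a", "miR-b"), target = c("G1", "G3"))
)

toy_per <- validated_percentage(toy_predicted, toy_validated)
toy_per
```

Returning `NA` for a sample without predictions is a design choice of this sketch. Averaging such a vector would then need `mean(x, na.rm = TRUE)`.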
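A combination with higher accuracy obtains a larger rank score, and a combination with a larger rank score is regarded as the better (more practical) combination. The final accuracy rank score of a combination is the mean of its ranks under three settings of prior information: none, TargetScan and ENCORI.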
```{r, eval=TRUE, include=TRUE}
## Calculate rank score of 10 combinations (5 network inference methods and 2 strategies)
AP_None <- c(mean(Scan.interp_Pearson_NULL_res_validated_per), mean(Scan.interp_Euclidean_NULL_res_validated_per), mean(Scan.interp_MI_NULL_res_validated_per), mean(Scan.interp_Lasso_NULL_res_validated_per), mean(Scan.interp_Phit_NULL_res_validated_per),
mean(Scan.perturb_Pearson_NULL_res_validated_per), mean(Scan.perturb_Euclidean_NULL_res_validated_per), mean(Scan.perturb_MI_NULL_res_validated_per), mean(Scan.perturb_Lasso_NULL_res_validated_per), mean(Scan.perturb_Phit_NULL_res_validated_per))
AP_TargetScan <- c(mean(Scan.interp_Pearson_TargetScan_res_validated_per), mean(Scan.interp_Euclidean_TargetScan_res_validated_per), mean(Scan.interp_MI_TargetScan_res_validated_per), mean(Scan.interp_Lasso_TargetScan_res_validated_per), mean(Scan.interp_Phit_TargetScan_res_validated_per),
mean(Scan.perturb_Pearson_TargetScan_res_validated_per), mean(Scan.perturb_Euclidean_TargetScan_res_validated_per), mean(Scan.perturb_MI_TargetScan_res_validated_per), mean(Scan.perturb_Lasso_TargetScan_res_validated_per), mean(Scan.perturb_Phit_TargetScan_res_validated_per))
AP_ENCORI <- c(mean(Scan.interp_Pearson_ENCORI_res_validated_per), mean(Scan.interp_Euclidean_ENCORI_res_validated_per), mean(Scan.interp_MI_ENCORI_res_validated_per), mean(Scan.interp_Lasso_ENCORI_res_validated_per), mean(Scan.interp_Phit_ENCORI_res_validated_per),
mean(Scan.perturb_Pearson_ENCORI_res_validated_per), mean(Scan.perturb_Euclidean_ENCORI_res_validated_per), mean(Scan.perturb_MI_ENCORI_res_validated_per), mean(Scan.perturb_Lasso_ENCORI_res_validated_per), mean(Scan.perturb_Phit_ENCORI_res_validated_per))
AP_None_rank <- rank(AP_None)
AP_TargetScan_rank <- rank(AP_TargetScan)
AP_ENCORI_rank <- rank(AP_ENCORI)
AP_rank <- (AP_None_rank + AP_TargetScan_rank + AP_ENCORI_rank)/3
AP_rank
```
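To make the rank averaging concrete, here is a small worked example with made-up accuracy values for four hypothetical combinations (A to D) under two settings. All names and numbers are illustrative assumptions. The example relies on the default behaviour of `rank()`: the smallest value gets rank 1, and tied values share the average of their ranks.

```r
# Made-up accuracy percentages for four hypothetical combinations
toy_AP_None  <- c(A = 2.0, B = 3.5, C = 1.0, D = 3.5)
toy_AP_Prior <- c(A = 4.0, B = 6.0, C = 5.0, D = 7.0)

# Higher accuracy gives a larger rank; B and D tie in the first setting
toy_rank_None  <- rank(toy_AP_None)   # A = 2, B = 3.5, C = 1, D = 3.5
toy_rank_Prior <- rank(toy_AP_Prior)  # A = 1, B = 3,   C = 2, D = 4

# Average the ranks over the settings to get the accuracy rank score
toy_AP_rank <- (toy_rank_None + toy_rank_Prior) / 2
toy_AP_rank                           # A = 1.5, B = 3.25, C = 1.5, D = 3.75
```

Averaging ranks rather than raw percentages keeps one setting with unusually large values from dominating the score.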
# Efficiency comparison
For the efficiency comparison, we compare the runtime of the different combinations on the K562 single-cell RNA-sequencing data. A combination with a shorter runtime obtains a larger rank score and is regarded as more efficient. The runtimes used here are those recorded without prior information (the `NULL` setting).
```{r, eval=TRUE, include=TRUE}
## Calculate rank score of 10 combinations (5 network inference methods and 2 strategies)
Time <- c(Scan.interp_Pearson_runningtime_NULL, Scan.interp_Euclidean_runningtime_NULL, Scan.interp_MI_runningtime_NULL, Scan.interp_Lasso_runningtime_NULL, Scan.interp_Phit_runningtime_NULL, Scan.perturb_Pearson_runningtime_NULL, Scan.perturb_Euclidean_runningtime_NULL, Scan.perturb_MI_runningtime_NULL, Scan.perturb_Lasso_runningtime_NULL, Scan.perturb_Phit_runningtime_NULL)
Time_rank <- rank(-Time)
Time_rank
```
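The negation in `rank(-Time)` is what turns "shorter runtime" into "larger rank". A short sketch with made-up runtimes for four hypothetical combinations shows the effect. The names and values are assumptions for illustration only.

```r
# Made-up runtimes in seconds for four hypothetical combinations
toy_time <- c(A = 12.4, B = 250.0, C = 3.1, D = 47.8)

# rank() gives rank 1 to the smallest value, so ranking the negated runtimes
# gives the slowest combination rank 1 and the fastest the largest rank
toy_time_rank <- rank(-toy_time)
toy_time_rank  # A = 3, B = 1, C = 4, D = 2
```

With this orientation, the efficiency ranks point the same way as the accuracy ranks (larger is better), so the two can be averaged directly.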
# Optimal combination selection
To select the optimal combination, we consider both accuracy and efficiency and use an overall rank score [28] to evaluate the performance of each combination. The combination with the largest overall rank score is regarded as the optimal combination.
```{r, eval=TRUE, include=TRUE}
Overall_rank <- (AP_rank + Time_rank)/2
Overall_rank
```
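The overall rank score is easier to read when each entry is labelled with its combination. The sketch below assumes the ordering used in the chunks above (the five methods for the `interp` strategy, then the same five for `perturb`) and uses made-up rank values. The names `combo_names`, `toy_accuracy_rank`, `toy_efficiency_rank`, `toy_overall` and `toy_best` are hypothetical.

```r
# Labels in the same order as the vectors built above:
# interp (Pearson, Euclidean, MI, Lasso, Phit), then perturb (same methods)
combo_names <- paste(
  rep(c("interp", "perturb"), each = 5),
  rep(c("Pearson", "Euclidean", "MI", "Lasso", "Phit"), times = 2),
  sep = "_"
)

# Made-up rank scores standing in for AP_rank and Time_rank
toy_accuracy_rank   <- c(6, 3, 8, 9, 5, 7, 2, 4, 10, 1)
toy_efficiency_rank <- c(10, 9, 4, 6, 8, 5, 7, 1, 2, 3)

# Overall rank score: mean of the accuracy and efficiency ranks
toy_overall <- setNames((toy_accuracy_rank + toy_efficiency_rank) / 2, combo_names)

# List combinations from best to worst and extract the top one(s)
sort(toy_overall, decreasing = TRUE)
toy_best <- names(toy_overall)[toy_overall == max(toy_overall)]
toy_best
```

Comparing against `max()` instead of calling `which.max()` returns every combination that shares the top score, which matters because averaged ranks can tie.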
# Conclusions
In this tutorial, we list only 10 combinations (5 network inference methods and 2 strategies) to show how to select an optimal combination for identifying sample-specific miRNA regulation. Note that Scan provides 54 combinations (27 network inference methods and 2 strategies) for studying sample-specific miRNA regulation. Taken together, Scan provides a useful method to infer sample-specific miRNA regulation for new data, to benchmark new network inference methods, and to deepen the understanding of miRNA regulation at the resolution of individual samples.
# References
[1] Pearson K. Notes on the history of correlation. Biometrika. 1920;13:25–45.
[2] Spearman C. “General intelligence,” objectively determined and measured. The American Journal of Psychology. 1904;15:201–92.
[3] Kendall MG. A new measure of rank correlation. Biometrika. 1938;30:81–93.
[4] Szekely GJ, Rizzo ML, Bakirov NK. Measuring and testing dependence by correlation of distances. The Annals of Statistics. 2007;35:2769–94.
[5] Lopez-Paz D, Hennig P, Schölkopf B. The randomized dependence coefficient. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 1. Red Hook, NY, USA: Curran Associates Inc.; 2013. p. 1–9.
[6] Hoeffding W. A non-parametric test of independence. The Annals of Mathematical Statistics. 1948;19:546–57.
[7] Prill RJ, Marbach D, Saez-Rodriguez J, Sorger PK, Alexopoulos LG, Xue X, et al. Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS One. 2010;5:e9202.
[8] Wilcox R. Introduction to robust estimation and hypothesis testing. Academic Press; 2017.
[9] Zar J. Biostatistical analysis. Prentice-Hall/Pearson; 2010.
[10] Deza E, Deza M-M. Dictionary of distances. Amsterdam: Elsevier; 2006.
[11] Deza MM, Deza E. Encyclopedia of distances. In: Deza E, Deza MM, editors. Encyclopedia of Distances. Berlin, Heidelberg: Springer Berlin Heidelberg; 2009. p. 1–583.
[12] Craw S. Manhattan distance. In: Sammut C, Webb GI, editors. Encyclopedia of Machine Learning. Boston, MA: Springer US; 2010. p. 639–639.
[13] Lance GN, Williams WT. Computer programs for hierarchical polythetic classification (“similarity analyses”). The Computer Journal. 1966;9:60–4.
[14] Cantrell CD. Modern mathematical methods for physicists and engineers. Cambridge: Cambridge University Press; 2000.
[15] Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945;26:297–302.
[16] Duda R, Hart P, Stork D. Pattern classification. Wiley-Interscience; 2001.
[17] Mahalanobis PC. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta). 1936;2:49–55.
[18] Kraskov A, Stögbauer H, Grassberger P. Estimating mutual information. Phys Rev E. 2004;69:066138.
[19] Reshef DN, Reshef YA, Finucane HK, Grossman SR, McVean G, Turnbaugh PJ, et al. Detecting novel associations in large data sets. Science. 2011;334:1518–24.
[20] Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33:1–22.
[21] Huang JC, Babak T, Corson TW, Chua G, Khan S, Gallie BL, et al. Using expression profiling data to identify human microRNA targets. Nat Methods. 2007;4:1045–9.
[22] Quinn TP, Richardson MF, Lovell D, Crowley TM. propr: An R-package for identifying proportionally abundant features using compositional data analysis. Sci Rep. 2017;7:16252.
[23] Maathuis MH, Kalisch M, Bühlmann P. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics. 2009;37:3133–64.
[24] Agarwal V, Bell GW, Nam JW, Bartel DP. Predicting effective microRNA target sites in mammalian mRNAs. Elife. 2015;4:e05005.
[25] Li JH, Liu S, Zhou H, Qu LH, Yang JH. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res. 2014;42(Database issue):D92–7.
[26] Huang H, Lin Y-C, Cui S, Huang Y, Tang Y, Xu J, et al. miRTarBase update 2022: an informative resource for experimentally validated miRNA-target interactions. Nucleic Acids Res. 2022;50:D222–30.
[27] Karagkouni D, Paraskevopoulou MD, Chatzopoulos S, Vlachos IS, Tastsoglou S, Kanellos I, et al. DIANA-TarBase v8: a decade-long collection of experimentally supported miRNA-gene interactions. Nucleic Acids Res. 2018;46:D239–45.
[28] Zhang J, Liu L, Xu T, Zhang W, Zhao C, Li S, et al. miRSM: an R package to infer and analyse miRNA sponge modules in heterogeneous data. RNA Biol. 2021;18:2308–20.
# Session information
```{r}
sessionInfo()
```