forked from genomicsclass/labs
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathbioc2_rpacks.Rmd
118 lines (101 loc) · 3.77 KB
/
bioc2_rpacks.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
---
layout: page
title: "Understanding and building R packages"
---
```{r options, echo=FALSE}
library(knitr)
opts_chunk$set(fig.path=paste0("figure/", sub("(.*).Rmd","\\1",basename(knitr:::knit_concord$get('infile'))), "-"))
```
```{r getpacksa,echo=FALSE,results="hide"}
suppressPackageStartupMessages({
suppressMessages({
library(AnnotationDbi)
library(ggbio)
library(gwascat)
library(GenomicRanges)
library(ERBS)
library(OrganismDbi)
library(harbChIP)
library(yeastCC)
library(TxDb.Scerevisiae.UCSC.sacCer3.sgdGene)
})
})
```
## What is an R package?
Conceptually, an R package is a collection of functions, data
objects, and documentation that coherently support a family
of related data analysis operations.
Concretely, an R package is a structured collection of folders,
organized and populated according to the rules of
[Writing R Extensions](http://cran.r-project.org/doc/manuals/r-release/R-exts.html).
<a name="skel"></a>
### A new software package with `package.skeleton`
We can create our own packages using `package.skeleton`. We'll illustrate that now
with an enhancement to the ERBS package that was created for the course.
We'll create a new package that utilizes the peak data, defining
a function `juxta` that allows us to compare binding peak patterns for the two cell
types on a selected chromosome. (I have commented out code that
uses an alternative graphics engine, for optional exploration.)
Here's a definition of `juxta`. Add it to your R session.
```{r makej}
juxta = function (chrname="chr22", ...)
{
require(ERBS)
data(HepG2)
data(GM12878)
require(ggbio)
require(GenomicRanges) # "subset" is overused, need import detail
ap1 = autoplot(GenomicRanges::subset(HepG2, seqnames==chrname))
ap2 = autoplot(GenomicRanges::subset(GM12878, seqnames==chrname))
tracks(HepG2 = ap1, Bcell = ap2, ...)
# alternative code for Gviz below
# require(Gviz)
# ap1 = AnnotationTrack(GenomicRanges::subset(HepG2, seqnames==chrname))
# names(ap1) = "HepG2"
# ap2 = AnnotationTrack(GenomicRanges::subset(GM12878, seqnames==chrname))
# names(ap2) = "B-cell"
# ax = GenomeAxisTrack()
# plotTracks(list(ax, ap1, ap2))
}
```
Now demonstrate it as follows.
```{r doj,fig=TRUE}
library(ERBS)
juxta("chr22", main="ESRRA binding peaks on chr22")
```
In the video we will show how to use `package.skeleton` and the Rstudio
editor to generate, document, and install this new package! We will not
streamline the code in `juxta` to make use of inter-package
symbol transfer by properly writing the DESCRIPTION and NAMESPACE
files for the package, but leave this for an advanced course in
software development.
<a name="org"></a>
### A new annotation package with OrganismDbi
We have found the `Homo.sapiens` package to be quite convenient.
We can get gene models, symbol to GO mappings, and so on, without
remembering any more than `keytypes`, `columns`, `keys`, and `select`.
At present there is no similar resource for *S. cerevisiae*.
We can make one, following the OrganismDbi vignette. This is
a very lightweight integrative package.
```{r doodb}
library(OrganismDbi)
gd = list( join1 = c(GO.db="GOID", org.Sc.sgd.db="GO"),
join2 = c(org.Sc.sgd.db="ENTREZID",
TxDb.Scerevisiae.UCSC.sacCer3.sgdGene="GENEID"))
if (!file.exists("Sac.cer3")) # don't do twice...
makeOrganismPackage(pkgname="Sac.cer3", # simplify typing!
graphData=gd, organism="Saccharomyces cerevisiae",
version="1.0.0", maintainer="Student <[email protected]>",
author="Student <[email protected]>",
destDir=".",
license="Artistic-2.0")
```
At this point we have a folder structure in our
working folder that can support an installation.
```{r doinst}
install.packages("Sac.cer3", repos=NULL, type="source")
library(Sac.cer3)
Sac.cer3
columns(Sac.cer3)
genes(Sac.cer3)
```