---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  fig.path = "man/figures/README-",
  dev.args = list(png = list(type = "cairo")),
  fig.retina = 2
)
```
# ggblend: Blending and compositing algebra for ggplot2
<!-- badges: start -->
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
[![CRAN status](https://www.r-pkg.org/badges/version/ggblend)](https://CRAN.R-project.org/package=ggblend)
[![Codecov test coverage](https://codecov.io/gh/mjskay/ggblend/branch/main/graph/badge.svg)](https://app.codecov.io/gh/mjskay/ggblend?branch=main)
[![R-CMD-check](https://github.com/mjskay/ggblend/workflows/R-CMD-check/badge.svg)](https://github.com/mjskay/ggblend/actions)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7963886.svg)](https://doi.org/10.5281/zenodo.7963886)
<!-- badges: end -->
*ggblend* is a small algebra of operations for blending, copying, adjusting, and
compositing layers in *ggplot2*. It allows you to easily copy and adjust the
aesthetics or parameters of an existing layer, to partition a layer into multiple
pieces for re-composition, and to combine layers (or partitions of layers) using
blend modes (like `"multiply"`, `"overlay"`, etc.).
*ggblend* requires R ≥ 4.2, as blending and compositing support was added in that
version of R.
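
Before installing, it can help to confirm that your R session meets this requirement. The following is a minimal sketch using only base R, not anything provided by *ggblend*. It assumes that `dev.capabilities()` reports a `compositing` element (as it does in recent versions of R) and uses a throwaway `pdf(NULL)` device purely for illustration:

```r
# Check the R version requirement stated above.
r_ok = getRversion() >= "4.2.0"

# Query a throwaway device for its compositing support.
# pdf(NULL) is an assumption made for illustration; substitute the device
# you actually render with (for example, png(type = "cairo")).
pdf(NULL)
caps = dev.capabilities()
invisible(dev.off())

# If the device reports no compositing support, this is simply FALSE.
supports_multiply = "multiply" %in% caps$compositing

r_ok
supports_multiply
```

If `supports_multiply` is `FALSE` for the device you use, blends are unlikely to render as intended on that device.
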
## Installation
You can install *ggblend* from CRAN as follows:
```r
install.packages("ggblend")
```
You can install the development version of *ggblend* using:
```r
remotes::install_github("mjskay/ggblend")
```
## Blending within one geometry
We'll construct a simple dataset with two semi-overlapping point clouds. We'll
have two versions of the dataset: one with all the `"a"` points listed first,
and one with all the `"b"` points listed first.
```{r data, message=FALSE, warning=FALSE}
library(ggplot2)
library(ggblend)
theme_set(ggdist::theme_ggdist() + theme(
  plot.title = element_text(size = rel(1), lineheight = 1.1, face = "bold"),
  plot.subtitle = element_text(face = "italic"),
  panel.border = element_rect(color = "gray75", fill = NA)
))

set.seed(1234)
df_a = data.frame(x = rnorm(500, 0), y = rnorm(500, 1), set = "a")
df_b = data.frame(x = rnorm(500, 1), y = rnorm(500, 2), set = "b")

df_ab = rbind(df_a, df_b) |>
  transform(order = "draw a then b")

df_ba = rbind(df_b, df_a) |>
  transform(order = "draw b then a")

df = rbind(df_ab, df_ba)
```
A typical scatterplot of such data has a problem: the number of points that
appear to be in each group depends on the drawing order (*a then b*
versus *b then a*):
```{r scatter_noblend}
df |>
  ggplot(aes(x, y, color = set)) +
  geom_point(size = 3, alpha = 0.5) +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(title = "geom_point() without blending", subtitle = "Draw order matters.")
```
A *commutative* blend mode, like `"multiply"` or `"darken"`, is one potential
solution that does not depend on drawing order. We can apply a `blend()`
operation to `geom_point()` to achieve this. There are three ways to do this:
- `blend(geom_point(...), "multiply")` (normal function application)
- `geom_point(...) |> blend("multiply")` (piping)
- `geom_point(...) * blend("multiply")` (algebraic operations)
Function application and piping are equivalent. **In this case**, all three
approaches are equivalent. As we will see later, the multiplication approach
is useful when we want a shorthand for applying the same operation to multiple
layers in a list without combining those layers first (in other words,
multiplication of operations over layers is *distributive* in an algebraic
sense).
```{r scatter_blend}
df |>
  ggplot(aes(x, y, color = set)) +
  geom_point(size = 3, alpha = 0.5) |> blend("multiply") +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(
    title = "geom_point(alpha = 0.5) |> blend('multiply')",
    subtitle = "Draw order does not matter, but color is too dark."
  )
```
Now the output is identical no matter the draw order, although the output is quite dark.
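
The order independence can be checked with a little arithmetic. The sketch below is base R, not *ggblend*'s implementation: it assumes opaque colors with channels in [0, 1], the standard definition of multiply blending (a per-channel product), and the usual source-over formula for normal alpha blending. The two colors are made-up stand-ins for the two sets, and the helper names `multiply()` and `over()` are introduced here only for illustration:

```r
# Illustrative RGB colors with channels in [0, 1] (values are assumptions).
red   = c(r = 0.9, g = 0.1, b = 0.1)
blue  = c(r = 0.2, g = 0.5, b = 0.7)
white = c(r = 1, g = 1, b = 1)

# Multiply blending: per-channel product of source and backdrop.
multiply = function(src, dst) src * dst

# Normal ("over") blending of a source with opacity `alpha` onto an opaque backdrop.
over = function(src, dst, alpha = 0.5) alpha * src + (1 - alpha) * dst

# Multiply: drawing red then blue gives the same color as blue then red.
m_ab = multiply(blue, multiply(red, white))
m_ba = multiply(red, multiply(blue, white))
isTRUE(all.equal(m_ab, m_ba))          # TRUE

# Over: the color drawn last dominates, so the two orders differ.
a_then_b = over(blue, over(red, white))
b_then_a = over(red, over(blue, white))
isTRUE(all.equal(a_then_b, b_then_a))  # FALSE
```

Because every channel is at most 1, each multiplication can only keep a channel the same or make it darker, which is why the multiply-blended plot is dark.
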
## Partitioning layers
Part of the reason the output is very dark above is that all of the points are being
multiply-blended together. When many objects (here, individual points) are multiply-blended on top of each
other, the output tends to get dark very quickly.
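
To get a feel for how quickly this happens, consider a single color channel. This is a back-of-the-envelope sketch that assumes each overlapping point multiplies the channel by the same hypothetical value of 0.8; real points also carry alpha, so the numbers are only indicative:

```r
channel = 0.8                      # hypothetical channel value of one point (1 = white)
n_overlapping = c(1, 2, 5, 10, 20) # number of points stacked on one pixel

# Repeated multiply blending is repeated multiplication of the channel value.
darkness = channel ^ n_overlapping
round(darkness, 3)
# [1] 0.800 0.640 0.328 0.107 0.012
```
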
However, we really only need the two sets to be multiply-blended with each other.
Within each set, we can use regular alpha blending. To do that, we can partition the geometry
by `set` and then blend. Each partition will be blended normally within the set, and
then the resulting sets will be multiply-blended together just once:
```{r scatter_partition_blend}
df |>
  ggplot(aes(x, y, color = set)) +
  geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("multiply") +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(
    title = "geom_point(alpha = 0.5) |> partition(vars(set)) |> blend('multiply')",
    subtitle = "Light outside the intersection, but still dark inside the intersection."
  )
```
That's getting there: points outside the intersection of the two sets look good,
but the intersection is still a bit dark.
Let's try combining two blend modes to address this: we'll use a `"lighten"`
blend mode (which is also commutative) to make the overlapping regions
lighter, and then draw the `"multiply"`-blended version on top at an `alpha`
of less than 1:
```{r scatter_lighten_multiply}
df |>
  ggplot(aes(x, y, color = set)) +
  geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("lighten") +
  geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend("multiply", alpha = 0.5) +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(
    title =
      "geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend('lighten') + \ngeom_point(size = 3, alpha = 0.5) |> partition(vars(set)) |> blend('multiply', alpha = 0.5)",
    subtitle = 'A good compromise, but a long specification.'
  ) +
  theme(plot.subtitle = element_text(lineheight = 1.2))
```
Now it's a little easier to see both overlap and density, and the output remains independent of draw order.
However, it is a little verbose to need to copy out a layer multiple times:
```r
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) * blend("lighten") +
geom_point(size = 3, alpha = 0.5) |> partition(vars(set)) * blend("multiply", alpha = 0.5)
```
We can simplify this in two ways: first, `partition(vars(set))` is equivalent
to setting `aes(partition = set)`, so we can move the partition specification
into the global plot aesthetics, since it is the same on every layer.
Second, operations and layers in *ggblend* act as a small algebra. Operations and sums
of operations can be multiplied by layers and lists of layers, and those
operations are distributed over the layers. This is where `*` and `|>` differ:
`|>` does not distribute operations like `blend()` over layers, which is
useful if you want to use a blend to combine multiple layers together rather
than applying that blend to each layer individually.
Because multiplication distributes, we can "factor out"
`geom_point(size = 3, alpha = 0.5)` from the above expression, yielding this:
```r
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5))
```
Both expressions are equivalent. Thus we can rewrite the previous example
like so:
```{r scatter_lighten_multiply_stacked}
df |>
  ggplot(aes(x, y, color = set, partition = set)) +
  geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)) +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(
    title = "geom_point(aes(partition = set)) * (blend('lighten') + blend('multiply', alpha = 0.5))",
    subtitle = "Two order-independent blends on one layer using the distributive law."
  ) +
  theme(plot.subtitle = element_text(lineheight = 1.2))
```
## Blending multiple geometries
We can also blend geometries together by passing a list of geometries to `blend()`.
These lists can include already-blended geometries:
```{r scatter_blend_geom_incorrect}
df |>
  ggplot(aes(x, y, color = set, partition = set)) +
  list(
    geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)),
    geom_vline(xintercept = 0, color = "gray75", linewidth = 1.5),
    geom_hline(yintercept = 0, color = "gray75", linewidth = 1.5)
  ) |> blend("hard.light") +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(
    title = "Blending multiple geometries together in a list",
    subtitle = "Careful! The point layer blend is incorrect!"
  )
```
Whoops! If you look closely, the blending of the `geom_point()` layers appears to
have changed. Recall that this expression:
```r
geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5))
```
is equivalent to specifying two separate layers, one with `blend("lighten")`
and the other with `blend("multiply", alpha = 0.5)`. Thus, when you apply
`|> blend("hard.light")` to the `list()` of layers, it will use a hard light
blend mode to blend these two layers together, whereas previously they were
blended using the normal (or `"over"`) blend mode.
We can gain back the original appearance by blending these two layers together
with `|> blend()` prior to applying the hard light blend:
```{r scatter_blend_geom}
df |>
  ggplot(aes(x, y, color = set, partition = set)) +
  list(
    geom_point(size = 3, alpha = 0.5) * (blend("lighten") + blend("multiply", alpha = 0.5)) |> blend(),
    geom_vline(xintercept = 0, color = "gray75", linewidth = 1.5),
    geom_hline(yintercept = 0, color = "gray75", linewidth = 1.5)
  ) |> blend("hard.light") +
  scale_color_brewer(palette = "Set1") +
  facet_grid(~ order) +
  labs(title = "Blending multiple geometries together")
```
## Partitioning and blending lineribbons
Another case where it's useful to have finer-grained control of blending within a given
geometry is when drawing overlapping uncertainty bands. Here, we'll show how to use `blend()` with `stat_lineribbon()`
from [ggdist](https://mjskay.github.io/ggdist/)
to create overlapping gradient ribbons depicting uncertainty.
We'll fit a model:
```{r m_mpg}
m_mpg = lm(mpg ~ hp * cyl, data = mtcars)
```
And generate some confidence distributions for the mean using [distributional](https://pkg.mitchelloharawild.com/distributional/):
```{r lineribbon}
predictions = unique(mtcars[, c("cyl", "hp")])
predictions$mu_hat = with(predict(m_mpg, newdata = predictions, se.fit = TRUE),
  distributional::dist_student_t(df = df, mu = fit, sigma = se.fit)
)
predictions
```
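
As a sanity check on what these distributions represent, the sketch below (base R only, and not part of the original analysis) compares the central 95% interval of each Student's t confidence distribution, computed by hand from `fit`, `se.fit`, and `df`, with the interval returned by `predict(..., interval = "confidence")`. The names `new_data`, `pred`, `manual`, and `builtin` are introduced here for illustration:

```r
m_mpg = lm(mpg ~ hp * cyl, data = mtcars)
new_data = unique(mtcars[, c("cyl", "hp")])

# Point estimates, standard errors, and residual degrees of freedom.
pred = predict(m_mpg, newdata = new_data, se.fit = TRUE)

# Central 95% interval of a location-scale Student's t distribution:
# mu +/- qt(0.975, df) * sigma
half_width = qt(0.975, df = pred$df) * pred$se.fit
manual = cbind(lwr = pred$fit - half_width, upr = pred$fit + half_width)

# The same interval as computed by predict() itself.
builtin = predict(m_mpg, newdata = new_data, interval = "confidence", level = 0.95)

all.equal(unname(manual), unname(builtin[, c("lwr", "upr")]))  # TRUE
```

In other words, each `mu_hat` is a distribution whose quantiles reproduce the usual confidence intervals for the conditional mean at every level, which is what lets the ribbons show many interval widths at once.
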
A basic plot based on examples in `vignette("freq-uncertainty-vis", package = "ggdist")` and
`vignette("lineribbon", package = "ggdist")` may have issues when lineribbons overlap:
```{r lineribbon_noblend}
predictions |>
  ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
  ggdist::stat_lineribbon(
    aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
    .width = ppoints(40)
  ) +
  geom_point(aes(y = mpg), data = mtcars) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2") +
  ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
  labs(
    title = "ggdist::stat_lineribbon()",
    subtitle = "Overlapping lineribbons obscure each other.",
    color = "cyl", fill = "cyl", y = "mpg"
  )
```
Notice the overlap of the orange (`cyl = 6`) and purple (`cyl = 8`) lines.
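
As an aside on the `.width = ppoints(40)` argument used in the plot above: `ppoints()` is a base R function, and for 40 points it returns evenly spaced probabilities strictly between 0 and 1. Each one becomes the width of one nested interval, which is what produces the smooth gradient. A quick sketch to inspect the values:

```r
widths = ppoints(40)

length(widths)  # 40
range(widths)   # 0.0125 0.9875

# For n > 10, ppoints(n) is (1:n - 0.5) / n.
all.equal(widths, (1:40 - 0.5) / 40)  # TRUE
```
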
If we add a `partition = cyl` aesthetic mapping, we can blend the geometries
for the different levels of `cyl` together with a `blend()` call around
`ggdist::stat_lineribbon()`.
There are several ways we could add the partition to the plot:
1. Add `partition = cyl` to the existing `aes(...)` call. However, this
   leaves the partitioning information far from the call to `blend()`, so the
   relationship between them is less clear.
2. Add `aes(partition = cyl)` to the `stat_lineribbon(...)` call. This is
   a more localized change (better!), but will raise a warning if `stat_lineribbon()`
   itself does not recognize the `partition` aesthetic.
3. Add `|> adjust(aes(partition = cyl))` after `stat_lineribbon(...)` to
   add the `partition` aesthetic to it (this will bypass the warning).
4. Add `|> partition(vars(cyl))` after `stat_lineribbon(...)` to add the
   `partition` aesthetic. This is an alias for the `adjust()` approach that is
   intended to be clearer. It takes a specification for a partition that is
   similar to `facet_wrap()`: either a one-sided formula or a call to `vars()`.
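
The third and fourth options can be written side by side as follows. This is a sketch rather than output from the package documentation: `geom_point()` stands in for `stat_lineribbon(...)` to keep the example short, the variable names are made up for illustration, and the formula form is assumed to behave as described in option 4:

```r
library(ggplot2)
library(ggblend)

# Option 3: add the partition aesthetic with adjust().
by_adjust = geom_point() |> adjust(aes(partition = cyl))

# Option 4: the partition() alias, using vars() ...
by_vars = geom_point() |> partition(vars(cyl))

# ... or using a one-sided formula.
by_formula = geom_point() |> partition(~ cyl)
```

Any of the three results can then be passed on to `blend()` in the same way.
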
Let's try the fourth approach:
```{r lineribbon_blend}
predictions |>
  ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
  ggdist::stat_lineribbon(
    aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
    .width = ppoints(40)
  ) |> partition(vars(cyl)) |> blend("multiply") +
  geom_point(aes(y = mpg), data = mtcars) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2") +
  ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
  labs(
    title = "ggdist::stat_lineribbon() |> partition(vars(cyl)) |> blend('multiply')",
    subtitle = "Overlapping lineribbons blend together independent of draw order.",
    color = "cyl", fill = "cyl", y = "mpg"
  )
```
Now the overlapping ribbons are blended together.
## Highlighting geoms using `copy_under()`
A common visualization technique to make a layer more salient (especially in the
presence of many other competing layers) is to add a small outline around
it. For some geometries (like `geom_point()`) this is easy; but for others (like `geom_line()`),
there's no easy way to do this without manually copying the layer.
The *ggblend* layer algebra makes this straightforward using the `adjust()` operation
combined with operator addition and multiplication. For example, given a layer
like:
```r
geom_line(linewidth = 1)
```
To add a white outline, you might want something like:
```r
geom_line(color = "white", linewidth = 2.5) + geom_line(linewidth = 1)
```
However, we'd rather not have to write the `geom_line()` specification twice.
If we factor out the differences between the first and second layer, we can use
the `adjust()` operation (which lets you change the aesthetics and parameters
of a layer) along with the distributive law to factor out
`geom_line(linewidth = 1)` and write the above specification as:
```r
geom_line(linewidth = 1) * (adjust(color = "white", linewidth = 2.5) + 1)
```
The `copy_under(...)` operation, which is a synonym for `adjust(...) + 1`,
also implements this pattern:
```r
geom_line(linewidth = 1) * copy_under(color = "white", linewidth = 2.5)
```
Here's an example highlighting the fit lines from our previous lineribbon example:
```{r lineribbon_blend_highlight}
predictions |>
  ggplot(aes(x = hp, fill = ordered(cyl), color = ordered(cyl))) +
  ggdist::stat_ribbon(
    aes(ydist = mu_hat, fill_ramp = after_stat(.width)),
    .width = ppoints(40)
  ) |> partition(vars(cyl)) |> blend("multiply") +
  geom_line(aes(y = median(mu_hat)), linewidth = 1) |> copy_under(color = "white", linewidth = 2.5) +
  geom_point(aes(y = mpg), data = mtcars) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Dark2") +
  ggdist::scale_fill_ramp_continuous(range = c(1, 0)) +
  labs(
    title = "geom_line() |> copy_under(color = 'white', linewidth = 2.5)",
    subtitle = "Highlights the line layer without manually copying its specification.",
    color = "cyl", fill = "cyl", y = "mpg"
  )
```
Note that `copy_under(...)` is implemented simply as a synonym for
`adjust(...) + 1`; we can see this if we look at `copy_under()` itself:
```{r}
copy_under()
```
In fact, not that it is particularly useful, but addition and multiplication
of layer operations are expanded appropriately:
```{r}
(adjust() + 3) * 2
```
I hesitate to imagine what that feature might be useful for...
## Compatibility with other packages
In theory *ggblend* should be compatible with other packages, though in more
complex cases (blending lists of geoms or using the `partition` aesthetic)
it is possible it may fail, as these features are a bit more hackish. I have
done some testing with a few other layer-manipulating packages---including
[gganimate](https://gganimate.com/), [ggnewscale](https://eliocamp.github.io/ggnewscale/),
and [relayer](https://github.com/clauswilke/relayer)---and they appear to be
compatible.
As a hard test, here are all three features applied to a modified version of the
Gapminder example used in the [gganimate documentation](https://gganimate.com/):
```{r gapminder, message=FALSE, warning=FALSE}
library(gganimate)
library(gapminder)
p = gapminder |>
  ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
  list(
    geom_point(show.legend = c(size = FALSE)) |> partition(vars(continent)) |> blend("multiply"),
    geom_hline(yintercept = 70, linewidth = 1.5, color = "gray75")
  ) |> blend("hard.light") +
  scale_color_manual(
    # same as colorspace::lighten(continent_colors, 0.35)
    values = c(
      Africa = "#BE7658", Americas = "#E95866", Asia = "#7C5C86",
      Europe = "#659C5D", Oceania = "#7477CA"
    ),
    guide = guide_legend(override.aes = list(size = 4))
  ) +
  scale_size(range = c(2, 12)) +
  scale_x_log10(labels = scales::label_dollar(scale_cut = scales::cut_short_scale())) +
  scale_y_continuous(breaks = seq(20, 80, by = 10)) +
  labs(
    title = 'Gapminder with gganimate and ggblend',
    subtitle = 'Year: {frame_time}',
    x = 'GDP per capita',
    y = 'Life expectancy'
  ) +
  transition_time(year) +
  ease_aes('linear')
animate(p, type = "cairo", width = 600, height = 400, res = 100)
```