There's something about .by (hadley#1351)
hadley authored Mar 9, 2023
1 parent 64841cc commit 8c03ddc
Showing 1 changed file with 33 additions and 0 deletions.
33 changes: 33 additions & 0 deletions data-transform.qmd
@@ -699,6 +699,39 @@ daily |>

You get a single row back because dplyr treats all the rows in an ungrouped data frame as belonging to one group.

### `.by`

dplyr 1.1.0 includes a new, experimental syntax for per-operation grouping: the `.by` argument.
`group_by()` and `ungroup()` aren't going away, but you can now also use the `.by` argument to group within a single operation:

```{r}
#| results: false
flights |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .by = month
  )
```

Or if you want to group by multiple variables:

```{r}
#| results: false
flights |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .by = c(origin, dest)
  )
```

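`.by` works with the verbs that respect grouping, such as `summarize()`, `mutate()`, `filter()`, and the `slice_*()` functions. It has the advantage that you don't need to use the `.groups` argument to suppress the grouping message or call `ungroup()` when you're done.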
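
To make that comparison concrete, here is a minimal sketch that computes the same summary both ways. It assumes dplyr 1.1.0 or later is installed; `toy` is a small, made-up stand-in for `flights` (the values are hypothetical), and `with_group_by` and `with_by` are illustrative names only.

```{r}
library(dplyr) # assumes dplyr >= 1.1.0, where `.by` was introduced

# Hypothetical toy data standing in for `flights`
toy <- tibble(
  origin = c("JFK", "JFK", "LGA", "LGA"),
  dest = c("IAH", "IAH", "ORD", "ORD"),
  dep_delay = c(10, NA, 0, 4)
)

# Persistent grouping: `.groups = "drop"` both silences the grouping
# message and returns an ungrouped result
with_group_by <- toy |>
  group_by(origin, dest) |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .groups = "drop"
  )

# Per-operation grouping: the grouping applies to this call only,
# so there is nothing to drop afterwards
with_by <- toy |>
  summarize(
    delay = mean(dep_delay, na.rm = TRUE),
    n = n(),
    .by = c(origin, dest)
  )

# The `.by` result is not a grouped data frame
is_grouped_df(with_by)

# Both approaches give the same summary for this data
all.equal(as.data.frame(with_group_by), as.data.frame(with_by))
```

One difference to keep in mind: `group_by()` sorts the output by the grouping keys, while `.by` keeps groups in the order they first appear in the data. The two results line up here only because the toy rows are already sorted.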

We didn't focus on this syntax in this chapter because it was very new when we wrote the book.
We did want to mention it because we think it has a lot of promise and it's likely to be quite popular.
You can learn more about it in the [dplyr 1.1.0 blog post](https://www.tidyverse.org/blog/2023/02/dplyr-1-1-0-per-operation-grouping/).

### Exercises

1. Which carrier has the worst average delays?