There's something about .by (hadley#1351)

Fixes hadley#1242
ebailey78 · Mar 9, 2023 · 8c03ddc
1 parent 64841cc
commit 8c03ddc
Showing 1 changed file with 33 additions and 0 deletions.
diff --git a/data-transform.qmd b/data-transform.qmd
@@ -699,6 +699,39 @@ daily |>
 
 You get a single row back because dplyr treats all the rows in an ungrouped data frame as belonging to one group.
 
+### `.by`
+
+dplyr 1.1.0 includes a new, experimental syntax for per-operation grouping: the `.by` argument.
+`group_by()` and `ungroup()` aren't going away, but you can now also use the `.by` argument to group within a single operation:
+
+```{r}
+#| results: false
+flights |> 
+  summarize(
+    delay = mean(dep_delay, na.rm = TRUE), 
+    n = n(),
+    .by = month
+  )
+```
+
+Or if you want to group by multiple variables:
+
+```{r}
+#| results: false
+flights |> 
+  summarize(
+    delay = mean(dep_delay, na.rm = TRUE), 
+    n = n(),
+    .by = c(origin, dest)
+  )
+```
+
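+`.by` works with the dplyr verbs that respect grouping (such as `summarize()`, `mutate()`, `filter()`, and the `slice_*()` functions) and has the advantage that you don't need the `.groups` argument to suppress the grouping message or a call to `ungroup()` when you're done.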
+
+We didn't focus on this syntax in this chapter because it was very new when we wrote the book.
+We did want to mention it because we think it has a lot of promise and it's likely to be quite popular.
+You can learn more about it in the [dplyr 1.1.0 blog post](https://www.tidyverse.org/blog/2023/02/dplyr-1-1-0-per-operation-grouping/).
+
 ### Exercises
 
 1.  Which carrier has the worst average delays?