Final comments from whole game (hadley#1347)
hadley authored Mar 8, 2023
1 parent 08c3cdf commit 1e488f3
Showing 3 changed files with 5 additions and 12 deletions.
4 changes: 1 addition & 3 deletions data-import.qmd
@@ -434,9 +434,7 @@ sales_files
## Writing to a file {#sec-writing-to-a-file}

readr also comes with two useful functions for writing data back to disk: `write_csv()` and `write_tsv()`.
-Both functions increase the chances of the output file being read back in correctly by using the standard UTF-8 encoding for strings and ISO8601 format for date-times.
-
-The most important arguments are `x` (the data frame to save), and `file` (the location to save it).
+The most important arguments to these functions are `x` (the data frame to save) and `file` (the location to save it).
You can also specify how missing values are written with `na`, and if you want to `append` to an existing file.

```{r}
11 changes: 3 additions & 8 deletions data-tidy.qmd
@@ -378,7 +378,7 @@ An alternative to `names_sep` is `names_pattern`, which you can use to extract v

Conceptually, this is only a minor variation on the simpler case you've already seen.
@fig-pivot-multiple-names shows the basic idea: now, instead of the column names pivoting into a single column, they pivot into multiple columns.
-You can imagine this happening in two steps (first pivoting and then separating) but under the hood it happens in a single step because that gives better performance.
+You can imagine this happening in two steps (first pivoting and then separating) but under the hood it happens in a single step because that's faster.

```{r}
#| label: fig-pivot-multiple-names
@@ -409,7 +409,7 @@ household

This dataset contains data about five families, with the names and dates of birth of up to two children.
The new challenge in this dataset is that the column names contain the names of two variables (`dob`, `name`) and the values of another (`child`, with values 1 or 2).
-To solve this problem we again need to supply a vector to `names_to` but this time we use the special `".value"` sentinel.
+To solve this problem we again need to supply a vector to `names_to` but this time we use the special `".value"` sentinel; this isn't the name of a variable but a unique value that tells `pivot_longer()` to do something different.
This overrides the usual `values_to` argument to use the first component of the pivoted column name as a variable name in the output.

```{r}
@@ -419,13 +419,10 @@ household |>
    names_to = c(".value", "child"),
    names_sep = "_",
    values_drop_na = TRUE
-  ) |>
-  mutate(
-    child = parse_number(child)
  )
```

-We again use `values_drop_na = TRUE`, since the shape of the input forces the creation of explicit missing variables (e.g. for families with only one child), and `parse_number()` to convert (e.g.) `child1` into 1.
+We again use `values_drop_na = TRUE`, since the shape of the input forces the creation of explicit missing variables (e.g. for families with only one child).

@fig-pivot-names-and-values illustrates the basic idea with a simpler example.
When you use `".value"` in `names_to`, the column names in the input contribute to both values and variable names in the output.
@@ -519,8 +516,6 @@ df |>
)
```

-The connection between the position of the row in the input and the cell in the output is weaker than in `pivot_longer()` because the rows and columns in the output are primarily determined by the values of variables, not their locations.
-
To begin the process `pivot_wider()` needs to first figure out what will go in the rows and columns.
Finding the new column names is easy: it's just the unique values of `name`.

2 changes: 1 addition & 1 deletion workflow-scripts.qmd
@@ -198,7 +198,7 @@ knitr::include_graphics("diagrams/rstudio/clean-slate.png", dpi = 270)

There is a great pair of keyboard shortcuts that will work together to make sure you've captured the important parts of your code in the editor:

-1. Press Cmd/Ctrl + Shift + F10 to restart R.
+1. Press Cmd/Ctrl + Shift + 0 to restart R.
2. Press Cmd/Ctrl + Shift + S to re-run the current script.

We collectively use this pattern hundreds of times a week.