Add exercises. Update adverbs
hadley committed Nov 19, 2015
1 parent 269867d commit 445b1a0
Showing 1 changed file with 35 additions and 6 deletions.
41 changes: 35 additions & 6 deletions lists.Rmd
@@ -228,6 +228,24 @@ compute_summary(x, mean)

Instead of hardcoding the summary function, we allow it to vary by adding an additional argument that is a function. It can take a while to wrap your head around this, but it's a very powerful technique. This is one of the reasons that R is known as a "functional" programming language.
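
To make the idea concrete, here is a minimal sketch of a function that takes another function as an argument. The definition of `compute_summary()` is not visible in this diff, so the version below is an assumption written for illustration, as is the sample list `x`.

```r
# Hypothetical definition: the chapter's real compute_summary() is not shown
# in this diff, so this is only an illustrative stand-in.
compute_summary <- function(x, fun) {
  out <- vector("numeric", length(x))  # preallocate one slot per element
  for (i in seq_along(x)) {
    out[[i]] <- fun(x[[i]])            # apply whichever function was passed in
  }
  out
}

# Made-up sample data
x <- list(a = c(1, 2, 3), b = c(10, 20, 60))

compute_summary(x, mean)    # summarise each element with mean()
compute_summary(x, median)  # same loop, different summary
```

The loop is written once; only the `fun` argument changes between calls.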

### Exercises

1. Read the documentation for `apply()`. In the 2d case, what two for loops
does it generalise?

1. It's common to see for loops that don't preallocate the output and instead
increase the length of a vector at each step:

```{r}
results <- vector("integer", 0)
for (i in seq_along(x)) {
  # Append one value per element, so the vector grows at every step
  results <- c(results, length(x[[i]]))
}
results
```
How does this impact performance?
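
One way to investigate the question is to time both styles side by side. The sketch below is an assumption-laden illustration: the helper names `grow()` and `prealloc()` and the size `n` are made up, and the timings will vary by machine.

```r
n <- 1e4  # arbitrary size chosen for illustration

# Hypothetical helper: grows the output by one element per iteration
grow <- function(n) {
  results <- vector("integer", 0)
  for (i in seq_len(n)) {
    results <- c(results, i)
  }
  results
}

# Hypothetical helper: allocates the full output once, then fills it in
prealloc <- function(n) {
  results <- vector("integer", n)
  for (i in seq_len(n)) {
    results[[i]] <- i
  }
  results
}

# Both approaches produce the same vector
stopifnot(identical(grow(n), prealloc(n)))

system.time(grow(n))
system.time(prealloc(n))
```

Try increasing `n` and comparing how each timing changes.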
## The map functions
This pattern of looping over a list and doing something to each element is so common that the purrr package provides a family of functions to do it for you. Each function always returns the same type of output so there are six variations based on what sort of result you want:
@@ -304,14 +322,25 @@ If you're familiar with the apply family of functions in base R, you might have
`map_lgl(df, is.numeric)`. One advantage to `vapply()` over the map
functions is that it can also produce matrices.
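
As a small sketch of the matrix case, using only base R and a made-up data frame `df`: when the template passed to `vapply()` has length greater than one, the result is a matrix with one column per input element.

```r
# Made-up example data
df <- data.frame(
  a = c(1, 5, 3),
  b = c(10, 2, 8),
  c = c("x", "y", "z")
)

# Length-1 logical template: returns a named logical vector
is_num <- vapply(df, is.numeric, logical(1))
is_num

# Length-2 numeric template: returns a 2-row matrix (min and max per column)
ranges <- vapply(df[is_num], range, numeric(2))
ranges
```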
### Exercises
1. How can you determine which columns in a data frame are factors?
(Hint: data frames are lists.)
1. What happens when you use the map functions on vectors that aren't lists?
What does `map(1:5, runif)` do? Why?
1. What does `map(-2:2, rnorm, n = 5)` do? Why?
## Pipelines
`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time. For example, imagine you want to fit a linear model to each individual in a dataset.
`map()` is particularly useful when constructing more complex transformations because it both inputs and outputs a list. That makes it well suited for solving a problem a piece at a time.
Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.
TODO: find interesting dataset
For example, imagine you want to fit a linear model to each individual in a dataset. Let's start by working through the whole process on the complete dataset. It's always a good idea to start simple (with a single object), and figure out the basic workflow. Then you can generalise up to the harder problem of applying the same steps to multiple models.
You could start by creating a list where each element is a data frame for a different person:
```{r}
@@ -407,12 +436,12 @@ Other predicate functionals: `head_while()`, `tail_while()`, `some()`, `every()`

When you start doing many operations with purrr, you'll soon discover that not everything always succeeds. For example, you might be fitting a bunch of more complicated models, and not every model will converge. How do you ensure that one bad apple doesn't ruin the whole barrel?

Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safe()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning a error, `safe()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one will always be `NULL`, but because the structure is always the same its easier to deal with.
Dealing with errors is fundamentally painful because errors are sort of a side-channel to the way that functions usually return values. The best way to handle them is to turn them into a regular output with the `safely()` function. This function is similar to the `try()` function in base R, but instead of sometimes returning the original output and sometimes returning an error, `safely()` always returns the same type of object: a list with elements `result` and `error`. For any given run, one will always be `NULL`, but because the structure is always the same it's easier to deal with.

Let's illustrate this with a simple example: `log()`:

```{r}
safe_log <- safe(log)
safe_log <- safely(log)
str(safe_log(10))
str(safe_log("a"))
```
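
To show the shape of the idea without relying on purrr, here is a base R sketch built on `tryCatch()`. It is not purrr's implementation; `safely_sketch()` and `safe_log_sketch()` are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: wraps f so that it always returns a list with
# elements `result` and `error`, exactly one of which is NULL.
safely_sketch <- function(f) {
  function(...) {
    tryCatch(
      list(result = f(...), error = NULL),
      error = function(e) list(result = NULL, error = e)
    )
  }
}

safe_log_sketch <- safely_sketch(log)

str(safe_log_sketch(10))   # result is filled in, error is NULL
str(safe_log_sketch("a"))  # result is NULL, error holds the condition object
```

Because both calls return a list with the same two elements, downstream code can handle successes and failures uniformly.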
@@ -459,10 +488,10 @@ dplyr::filter(all, is_ok)

Other related functions:

* `maybe()`: if you don't care about the error message, and instead
* `possibly()`: if you don't care about the error message, and instead
just want a default value on failure.

* `outputs()`: does a similar job but for other outputs like printed
* `quietly()`: does a similar job but for other outputs like printed
output, messages, and warnings.
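
The "default value on failure" behaviour can be sketched in base R as follows. This is an illustration of the idea, not purrr's code; `possibly_sketch()` and its `otherwise` argument name are assumptions made for this example.

```r
# Hypothetical helper: wraps f so that any error is replaced by `otherwise`
possibly_sketch <- function(f, otherwise) {
  function(...) {
    tryCatch(f(...), error = function(e) otherwise)
  }
}

possibly_log <- possibly_sketch(log, otherwise = NA_real_)

possibly_log(10)   # succeeds, so the usual value is returned
possibly_log("a")  # fails, so the default is returned instead
```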

Challenge: read all the csv files in this directory. Which ones failed