Commit

Grammar (hadley#271)
jonpage authored and hadley committed Aug 16, 2016
1 parent 68a1d54 commit fc8ea9d
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions EDA.Rmd
@@ -83,7 +83,7 @@ Every variable has its own pattern of variation, which can reveal interesting in

### Visualising distributions

-How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:
+How you visualise the distribution of a variable will depend on whether the variable is categorical or continuous. A variable is **categorical** if it can only take one of a small set of values. In R, categorical variables are usually saved as factors or character vectors. To examine the distribution of a categorical variable, use a bar chart:

```{r}
ggplot(data = diamonds) +
@@ -130,7 +130,7 @@ ggplot(data = smaller, mapping = aes(x = carat, colour = cut)) +
geom_freqpoly(binwidth = 0.1)
```

-There are a few challenges with this type of plot, which we will come back to in [visualisation a categorical and a continuous variable](#cat-cont).
+There are a few challenges with this type of plot, which we will come back to in [visualising a categorical and a continuous variable](#cat-cont).

Now that you can visualise variation, what should you look for in your plots? And what type of follow-up questions should you ask? I've put together a list below of the most useful types of information that you will find in your graphs, along with some follow up questions for each type of information. The key to asking good follow up questions will be to rely on your **curiosity** (What do you want to learn more about?) as well as your **skepticism** (How could this be misleading?).

@@ -582,7 +582,7 @@ ggplot(faithful, aes(eruptions)) +
geom_freqpoly(binwidth = 0.25)
```

-Sometimes we'll turn the end of pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.
+Sometimes we'll turn the end of a pipeline of data transformation into a plot. Watch for the transition from `%>%` to `+`. I wish this transition wasn't necessary but unfortunately ggplot2 was created before the pipe was discovered.

```{r, eval = FALSE}
diamonds %>%
@@ -591,4 +591,4 @@ diamonds %>%
geom_tile()
```

-If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: <https://amzn.com/331924275X>. It's been recently updated, so includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.
+If you want learn more about ggplot2, I'd highly recommend grabbing a copy of the ggplot2 book: <https://amzn.com/331924275X>. It's been recently updated, so it includes dplyr and tidyr code, and has much more space to explore all the facets of visualisation. Unfortunately the book isn't generally available for free, but if you have a connection to a university you can probably get an electronic version for free through SpringerLink.
