Commit

Make syntax for multi-line fig-cap and fig-alt consistent
mine-cetinkaya-rundel committed May 26, 2023
1 parent cbb5b1b commit 205c992
Showing 27 changed files with 227 additions and 227 deletions.
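
Note: the change swaps YAML's folded block scalar (`>`) for the literal block scalar (`|`) in multi-line `fig-alt` and `fig-cap` chunk options. In YAML, the folded style joins consecutive lines with a space, while the literal style keeps the line breaks. The following is a minimal sketch, not part of the commit, assuming the R yaml package is installed; the keys `folded` and `literal` and the sample text are illustrative only. It does not show how any renderer treats the resulting alt text.

```r
# Sketch only: compare YAML's folded (>) and literal (|) block scalars.
# Assumes the yaml package is installed; the keys `folded` and `literal`
# are made up for this illustration.
library(yaml)

doc <- "
folded: >
  A histogram of carats of diamonds,
  with the x-axis ranging from 0 to 4.5.
literal: |
  A histogram of carats of diamonds,
  with the x-axis ranging from 0 to 4.5.
"

parsed <- yaml.load(doc)

# Folded style: the two source lines are joined with a single space.
print(parsed$folded)

# Literal style: the line break between the two source lines is kept.
print(parsed$literal)
```

Both styles keep one trailing newline by default, so the two strings differ only in the break between the first and second line.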
40 changes: 20 additions & 20 deletions EDA.qmd
@@ -81,7 +81,7 @@ We'll start our exploration by visualizing the distribution of weights (`carat`)
Since `carat` is a numerical variable, we can use a histogram:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A histogram of carats of diamonds, with the x-axis ranging from 0 to 4.5
#| and the y-axis ranging from 0 to 30000. The distribution is right skewed
#| with very few diamonds in the bin centered at 0, almost 30000 diamonds in
@@ -117,7 +117,7 @@ To turn this information into useful questions, look for anything unexpected:
Let's take a look at the distribution of `carat` for smaller diamonds.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A histogram of carats of diamonds, with the x-axis ranging from 0 to 3 and
#| the y-axis ranging from 0 to roughly 2500. The binwidth is quite narrow
#| (0.01), resulting in a very large number of skinny bars. The distribution
@@ -161,7 +161,7 @@ For example, take the distribution of the `y` variable from the diamonds dataset
The only evidence of outliers is the unusually wide limits on the x-axis.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A histogram of lengths of diamonds. The x-axis ranges from 0 to 60 and
#| the y-axis ranges from 0 to 12000. There is a peak around 5, and the
#| data appear to be completely clustered around the peak.
@@ -174,7 +174,7 @@ There are so many observations in the common bins that the rare bins are very sh
To make it easy to see the unusual values, we need to zoom to small values of the y-axis with `coord_cartesian()`:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A histogram of lengths of diamonds. The x-axis ranges from 0 to 60 and the
#| y-axis ranges from 0 to 50. There is a peak around 5, and the data
#| appear to be completely clustered around the peak. Other than those data,
@@ -270,7 +270,7 @@ It's not obvious where you should plot missing values, so ggplot2 doesn't includ
```{r}
#| dev: "png"
-#| fig-alt: >
+#| fig-alt: |
#| A scatterplot of widths vs. lengths of diamonds. There is a strong,
#| linear association between the two variables. All but one of the diamonds
#| has length greater than 3. The one outlier has a length of 0 and a width
@@ -297,7 +297,7 @@ You can do this by making a new variable, using `is.na()` to check if `dep_time`
[^eda-1]: Remember that when we need to be explicit about where a function (or dataset) comes from, we'll use the special form `package::function()` or `package::dataset`.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A frequency polygon of scheduled departure times of flights. Two lines
#| represent flights that are cancelled and not cancelled. The x-axis ranges
#| from 0 to 25 minutes and the y-axis ranges from 0 to 10000. The number of
@@ -340,7 +340,7 @@ The best way to spot covariation is to visualize the relationship between two or
For example, let's explore how the price of a diamond varies with its quality (measured by `cut`) using `geom_freqpoly()`:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A frequency polygon of prices of diamonds where each cut of carat (Fair,
#| Good, Very Good, Premium, and Ideal) is represented with a different color
#| line. The x-axis ranges from 0 to 30000 and the y-axis ranges from 0 to
@@ -361,7 +361,7 @@ To make the comparison easier we need to swap what is displayed on the y-axis.
Instead of displaying count, we'll display the **density**, which is the count standardized so that the area under each frequency polygon is one.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A frequency polygon of densities of prices of diamonds where each cut of
#| carat (Fair, Good, Very Good, Premium, and Ideal) is represented with a
#| different color line. The x-axis ranges from 0 to 20000. The lines overlap
@@ -382,7 +382,7 @@ But maybe that's because frequency polygons are a little hard to interpret - the
A visually simpler plot for exploring this relationship is using side-by-side boxplots.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side boxplots of prices of diamonds by cut. The distribution of
#| prices is right skewed for each cut (Fair, Good, Very Good, Premium, and
#| Ideal). The medians are close to each other, with the median for Ideal
@@ -404,7 +404,7 @@ For example, take the `class` variable in the `mpg` dataset.
You might be interested to know how highway mileage varies across classes:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the x-axis (2seaters, compact, midsize, minivan, pickup, subcompact,
#| and suv).
@@ -416,7 +416,7 @@ ggplot(mpg, aes(x = class, y = hwy)) +
To make the trend easier to see, we can reorder `class` based on the median value of `hwy`:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the x-axis and ordered by increasing median highway mileage (pickup,
#| suv, minivan, 2seater, subcompact, compact, and midsize).
@@ -429,7 +429,7 @@ If you have long variable names, `geom_boxplot()` will work better if you flip i
You can do that by exchanging the x and y aesthetic mappings.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side boxplots of highway mileages of cars by class. Classes are
#| on the y-axis and ordered by increasing median highway mileage.
@@ -468,7 +468,7 @@ To visualize the covariation between categorical variables, you'll need to count
One way to do that is to rely on the built-in `geom_count()`:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A scatterplot of color vs. cut of diamonds. There is one point for each
#| combination of levels of cut (Fair, Good, Very Good, Premium, and Ideal)
#| and color (D, E, F, G, G, I, and J). The sizes of the points represent
@@ -492,7 +492,7 @@ diamonds |>
Then visualize with `geom_tile()` and the fill aesthetic:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| A tile plot of cut vs. color of diamonds. Each tile represents a
#| cut/color combination and tiles are colored according to the number of
#| observations in each tile. There are more Ideal diamonds than other cuts,
@@ -528,7 +528,7 @@ The relationship is exponential.

```{r}
#| dev: "png"
-#| fig-alt: >
+#| fig-alt: |
#| A scatterplot of price vs. carat. The relationship is positive, somewhat
#| strong, and exponential.
@@ -543,7 +543,7 @@ You've already seen one way to fix the problem: using the `alpha` aesthetic to a

```{r}
#| dev: "png"
-#| fig-alt: >
+#| fig-alt: |
#| A scatterplot of price vs. carat. The relationship is positive, somewhat
#| strong, and exponential. The points are transparent, showing clusters where
#| the number of points is higher than other areas, The most obvious clusters
@@ -566,7 +566,7 @@ You will need to install the hexbin package to use `geom_hex()`.
```{r}
#| layout-ncol: 2
#| fig-width: 3
-#| fig-alt: >
+#| fig-alt: |
#| Plot 1: A binned density plot of price vs. carat. Plot 2: A hexagonal bin
#| plot of price vs. carat. Both plots show that the highest density of
#| diamonds have low carats and low prices.
@@ -584,7 +584,7 @@ Then you can use one of the techniques for visualizing the combination of a cate
For example, you could bin `carat` and then for each group, display a boxplot:

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side box plots of price by carat. Each box plot represents diamonds
#| that are 0.1 carats apart in weight. The box plots show that as carat
#| increases the median price increases as well. Additionally, diamonds with
@@ -668,7 +668,7 @@ Then, we exponentiate the residuals to put them back in the scale of raw prices.
```{r}
#| message: false
#| dev: "png"
-#| fig-alt: >
+#| fig-alt: |
#| A scatterplot of residuals vs. carat of diamonds. The x-axis ranges from 0
#| to 5, the y-axis ranges from 0 to almost 4. Much of the data are clustered
#| around low values of carat and residuals. There is a clear, curved pattern
@@ -695,7 +695,7 @@ ggplot(diamonds_aug, aes(x = carat, y = .resid)) +
Once you've removed the strong relationship between carat and price, you can see what you expect in the relationship between cut and price: relative to their size, better quality diamonds are more expensive.

```{r}
-#| fig-alt: >
+#| fig-alt: |
#| Side-by-side box plots of residuals by cut. The x-axis displays the various
#| cuts (Fair to Ideal), the y-axis ranges from 0 to almost 5. The medians are
#| quite similar, between roughly 0.75 to 1.25. Each of the distributions of
4 changes: 2 additions & 2 deletions base-R.qmd
@@ -346,11 +346,11 @@ If this pepper shaker is your list `pepper`, then, `pepper[1]` is a pepper shake
#| label: fig-pepper
#| echo: false
#| out-width: "100%"
-#| fig-cap: >
+#| fig-cap: |
#| (Left) A pepper shaker that Hadley once found in his hotel room.
#| (Middle) `pepper[1]`.
#| (Right) `pepper[[1]]`
-#| fig-alt: >
+#| fig-alt: |
#| Three photos. On the left is a photo of a glass pepper shaker. Instead of
#| the pepper shaker containing pepper, it contains a single packet of pepper.
#| In the middle is a photo of a single packet of pepper. On the right is a
4 changes: 2 additions & 2 deletions communicate.qmd
@@ -12,11 +12,11 @@ However, it doesn't matter how great your analysis is unless you can explain it
```{r}
#| label: fig-ds-communicate
#| echo: false
-#| fig-cap: >
+#| fig-cap: |
#| Communication is the final part of the data science process; if you
#| can't communicate your results to other humans, it doesn't matter how
#| great your analysis is.
-#| fig-alt: >
+#| fig-alt: |
#| A diagram displaying the data science cycle with
#| communicate highlighed in blue.
#| out.width: NULL