homework 3
clauswilke committed Jan 29, 2025
1 parent 9e7f63b commit 1bb676a
Showing 7 changed files with 155 additions and 1 deletion.
Binary file added _site/assignments/HW3.pdf
Binary file not shown.
71 changes: 71 additions & 0 deletions _site/assignments/HW3.qmd
@@ -0,0 +1,71 @@
---
title: "Homework 3"
format:
  typst:
    fig-format: png
    fig-dpi: 300
    fig-width: 6
    fig-height: 4
---

```{r}
#| echo: false
#| message: false
# !! Do not edit this code chunk !!
library(tidyverse)
library(palmerpenguins)
penguins2 <- na.omit(penguins)
# data prep:
OH_pop <- midwest |>
  filter(state == "OH") |>
  arrange(desc(poptotal)) |>
  mutate(row = row_number()) |>
  filter(poptotal >= 100000) |>
  select(c(county, poptotal))
```

**This homework is due on Feb. 6, 2025 at 11:00pm. Please submit as a pdf file on Canvas.**

**Problem 1: (8 pts)** For this problem, you will work with the `penguins2` dataset, which is equivalent to `penguins` but with all rows containing `NA` values removed.

```{r}
penguins2
```

Use ggplot to make a histogram of the `body_mass_g` column. Manually choose appropriate values for `binwidth` and `center`. Explain your choice of values in 2-3 sentences.

```{r}
# Your code goes here.
```

*Your explanation goes here.*
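
As a reading aid for how `binwidth` and `center` interact, here is a minimal sketch on a different dataset. It assumes base R's built-in `faithful` data (not part of this assignment), and the values 5 and 52.5 are illustrative choices only, not recommended answers for `body_mass_g`.

```r
library(ggplot2)

# `faithful` ships with base R; `waiting` is the waiting time between
# eruptions in minutes. It is used here only as a stand-in dataset.
p_hist <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(
    binwidth = 5,   # every bin spans 5 minutes
    center = 52.5,  # one bin is centered at 52.5, so bin edges land on multiples of 5
    fill = "steelblue",
    color = "white"
  )

p_hist
```

Choosing `center` as half a bin width away from a round number is one way to make the bin edges fall on round values, which tends to make the axis easier to read.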

**Problem 2: (6 pts)** For Problems 2 and 3, you will work with the dataset `OH_pop` that contains Ohio state demographics and has been derived from the `midwest` dataset provided by **ggplot2**. See here for details of the original dataset: https://ggplot2.tidyverse.org/reference/midwest.html. `OH_pop` contains two columns: `county` and `poptotal` (the county's total population), and it only contains counties with at least 100,000 inhabitants.

```{r}
OH_pop
```

Create a plot that satisfies the following two requirements:

(a) Use ggplot to make a scatter plot of `county` vs total population (column `poptotal`) and order the counties by the total population.

(b) Rename the axes and set appropriate limits, breaks and labels. Note: Do not use `xlab()` or `ylab()` to label the axes.

```{r}
# Your code goes here.
```
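
The two requirements above rely on two general techniques: reordering a categorical axis by a numeric value, and configuring an axis entirely through a scale function. The sketch below shows both on an invented toy data frame (`city_pop`, with made-up cities and populations), so it is an illustration of the mechanics rather than a solution for `OH_pop`. It uses base R's `reorder()`; `forcats::fct_reorder()` would be an alternative.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only.
city_pop <- data.frame(
  city = c("A", "B", "C", "D"),
  pop = c(250000, 1200000, 480000, 130000)
)

p_ordered <- ggplot(city_pop, aes(x = pop, y = reorder(city, pop))) +
  geom_point() +
  scale_x_continuous(
    name = "population",            # axis title set via the scale, not xlab()
    limits = c(0, 1250000),
    breaks = c(0, 500000, 1000000),
    labels = c("0", "0.5M", "1.0M") # one label per break
  ) +
  scale_y_discrete(name = NULL)     # drop the redundant axis title

p_ordered
```

Note that `breaks` and `labels` must have the same length when both are given as vectors.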

**Problem 3: (6 pts)**

Modify the plot from Problem 2 so it satisfies the following two requirements:

(a) Change the scale for `poptotal` to logarithmic.

(b) Adjust the limits, breaks, and labels so they are appropriate for the logarithmic scale.

```{r}
# Your code goes here.
```
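
For orientation, a log-scaled axis is configured much like a linear one, with the main difference that the limits must be strictly positive and the breaks are usually spaced multiplicatively. The following sketch again uses an invented toy data frame (`city_pop`), and the specific limits, breaks, and labels are assumptions chosen for that toy data.

```r
library(ggplot2)

# Hypothetical toy data, invented for illustration only.
city_pop <- data.frame(
  city = c("A", "B", "C", "D"),
  pop = c(250000, 1200000, 480000, 130000)
)

p_log <- ggplot(city_pop, aes(x = pop, y = reorder(city, pop))) +
  geom_point() +
  scale_x_log10(
    name = "population (log scale)",
    limits = c(1e5, 2e6),            # a lower limit of 0 is not valid on a log scale
    breaks = c(1e5, 3e5, 1e6),       # roughly evenly spaced after the log transform
    labels = c("100k", "300k", "1M")
  ) +
  scale_y_discrete(name = NULL)

p_log
```

The breaks and limits are given in the original data units; ggplot2 applies the log transformation internally.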
7 changes: 7 additions & 0 deletions _site/schedule.html
@@ -386,6 +386,13 @@ <h3 class="anchored" data-anchor-id="homework-2-due-jan-30-2025">Homework 2 (due
</section>
<section id="homework-3-due-feb-6-2025" class="level3">
<h3 class="anchored" data-anchor-id="homework-3-due-feb-6-2025">Homework 3 (due Feb 6, 2025)</h3>
<p class="nospace">
Materials:
</p>
<ul>
<li><a href="assignments/HW3.qmd">Quarto template</a></li>
<li><a href="assignments/HW3.pdf">PDF</a></li>
</ul>
</section>
<section id="homework-4-due-feb-27-2025" class="level3">
<h3 class="anchored" data-anchor-id="homework-4-due-feb-27-2025">Homework 4 (due Feb 27, 2025)</h3>
2 changes: 1 addition & 1 deletion _site/search.json
@@ -1404,7 +1404,7 @@
"href": "schedule.html#homeworks",
"title": "SDS 366 Schedule Spring 2025",
"section": "Homeworks",
"text": "Homeworks\nAll homeworks are due by 11:00pm on the day they are due. Homeworks need to be submitted as pdf files on Canvas.\n\nHomework 1 (due Jan 23, 2025)\n\nMaterials:\n\n\nQuarto template\nPDF\n\n\n\nHomework 2 (due Jan 30, 2025)\n\nMaterials:\n\n\nQuarto template\nPDF\n\n\n\nHomework 3 (due Feb 6, 2025)\n\n\nHomework 4 (due Feb 27, 2025)\n\n\nHomework 5 (due Mar 6, 2025)\n\n\nHomework 6 (due Apr 3, 2025)\n\n\nHomework 7 (due Apr 10, 2025)"
"text": "Homeworks\nAll homeworks are due by 11:00pm on the day they are due. Homeworks need to be submitted as pdf files on Canvas.\n\nHomework 1 (due Jan 23, 2025)\n\nMaterials:\n\n\nQuarto template\nPDF\n\n\n\nHomework 2 (due Jan 30, 2025)\n\nMaterials:\n\n\nQuarto template\nPDF\n\n\n\nHomework 3 (due Feb 6, 2025)\n\nMaterials:\n\n\nQuarto template\nPDF\n\n\n\nHomework 4 (due Feb 27, 2025)\n\n\nHomework 5 (due Mar 6, 2025)\n\n\nHomework 6 (due Apr 3, 2025)\n\n\nHomework 7 (due Apr 10, 2025)"
},
{
"objectID": "schedule.html#projects",
Binary file added assignments/HW3.pdf
Binary file not shown.
71 changes: 71 additions & 0 deletions assignments/HW3.qmd
@@ -0,0 +1,71 @@
---
title: "Homework 3"
format:
  typst:
    fig-format: png
    fig-dpi: 300
    fig-width: 6
    fig-height: 4
---

```{r}
#| echo: false
#| message: false
# !! Do not edit this code chunk !!
library(tidyverse)
library(palmerpenguins)
penguins2 <- na.omit(penguins)
# data prep:
OH_pop <- midwest |>
  filter(state == "OH") |>
  arrange(desc(poptotal)) |>
  mutate(row = row_number()) |>
  filter(poptotal >= 100000) |>
  select(c(county, poptotal))
```

**This homework is due on Feb. 6, 2025 at 11:00pm. Please submit as a pdf file on Canvas.**

**Problem 1: (8 pts)** For this problem, you will work with the `penguins2` dataset, which is equivalent to `penguins` but with all rows containing `NA` values removed.

```{r}
penguins2
```

Use ggplot to make a histogram of the `body_mass_g` column. Manually choose appropriate values for `binwidth` and `center`. Explain your choice of values in 2-3 sentences.

```{r}
# Your code goes here.
```

*Your explanation goes here.*

**Problem 2: (6 pts)** For Problems 2 and 3, you will work with the dataset `OH_pop` that contains Ohio state demographics and has been derived from the `midwest` dataset provided by **ggplot2**. See here for details of the original dataset: https://ggplot2.tidyverse.org/reference/midwest.html. `OH_pop` contains two columns: `county` and `poptotal` (the county's total population), and it only contains counties with at least 100,000 inhabitants.

```{r}
OH_pop
```

Create a plot that satisfies the following two requirements:

(a) Use ggplot to make a scatter plot of `county` vs total population (column `poptotal`) and order the counties by the total population.

(b) Rename the axes and set appropriate limits, breaks and labels. Note: Do not use `xlab()` or `ylab()` to label the axes.

```{r}
# Your code goes here.
```

**Problem 3: (6 pts)**

Modify the plot from Problem 2 so it satisfies the following two requirements:

(a) Change the scale for `poptotal` to logarithmic.

(b) Adjust the limits, breaks, and labels so they are appropriate for the logarithmic scale.

```{r}
# Your code goes here.
```
5 changes: 5 additions & 0 deletions schedule.qmd
@@ -63,6 +63,11 @@ All homeworks are due by 11:00pm on the day they are due. Homeworks need to be s

### Homework 3 (due Feb 6, 2025)

<p class="nospace">Materials:</p>

- [Quarto template](assignments/HW3.qmd)
- [PDF](assignments/HW3.pdf)

### Homework 4 (due Feb 27, 2025)

### Homework 5 (due Mar 6, 2025)