Use curl::multi_download()
hadley committed Jan 23, 2023
1 parent 707b332 commit f090584
Showing 2 changed files with 8 additions and 5 deletions.
1 change: 1 addition & 0 deletions DESCRIPTION
@@ -12,6 +12,7 @@ Depends:
Imports:
    arrow,
    babynames,
+   curl (>= 5.0.0),
    dplyr,
    duckdb,
    gapminder,
12 changes: 7 additions & 5 deletions arrow.qmd
@@ -53,16 +53,18 @@ We begin by getting a dataset worthy of these tools: a data set of item checkout
This dataset contains 41,389,465 rows that tell you how many times each book was checked out each month from April 2005 to October 2022.

The following code will get you a cached copy of the data.
- The data is a 9GB CSV file, so it will take some time to download: simply getting the data is often the first challenge!
+ The data is a 9GB CSV file, so it will take some time to download.
+ I highly recommend using `curl::multi_download()` to get very large files as it's built for exactly this purpose: it gives you a progress bar and it can resume the download if it's interrupted.

```{r}
#| eval: false
dir.create("data", showWarnings = FALSE)
- url <- "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv"
- # Default timeout is 60s; bump it up to an hour
- options(timeout = 60 * 60)
- download.file(url, "data/seattle-library-checkouts.csv")
+ curl::multi_download(
+   "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv",
+   "data/seattle-library-checkouts.csv",
+   resume = TRUE
+ )
```
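
A nice property of `curl::multi_download()` is that it returns a data frame with one row per file, so you can confirm a large download actually completed before trying to read it. A minimal sketch, assuming the result includes `success` and `status_code` columns as in curl 5.0.0:

```{r}
#| eval: false
# Capture the per-file status that multi_download() returns
res <- curl::multi_download(
  "https://r4ds.s3.us-west-2.amazonaws.com/seattle-library-checkouts.csv",
  "data/seattle-library-checkouts.csv",
  resume = TRUE
)
# `success` and `status_code` are assumed column names (curl >= 5.0.0);
# isTRUE() guards against NA when a download was interrupted
if (!isTRUE(all(res$success))) {
  stop("Download failed with status ", res$status_code)
}
```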

## Opening a dataset
