
Archiving and retrieving data with piggyback

cgpu edited this page May 13, 2020 · 1 revision

Retrieving files hosted in a GitHub release using ropensci/piggyback

Some of the Jupyter notebooks in this repo require input data that is hosted in a GitHub release. These results, generated either by Nextflow pipelines or by other notebooks, were archived using the method described by the author of the R package ropensci/piggyback.

To use ropensci/piggyback with private repositories, a GITHUB_TOKEN must be set as an environment variable in the R session in which you are working. To generate such a token with sensible default permissions, the R package usethis has a convenient function named browse_github_token() (renamed create_github_token() in usethis 2.0.0 and later).

# install.packages("usethis")
usethis::browse_github_token()

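This will redirect you to GitHub to create your own GitHub token. Once you have the token, you can make it available to your current R session by typing the following in the R console (to keep it across sessions, add a GITHUB_TOKEN entry to your .Renviron file instead):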

Sys.setenv(GITHUB_TOKEN = "youractualtokenindoublequotes")

Then you are ready to use the function piggyback::pb_download() to retrieve the file of interest hosted in a GitHub release.
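
As a minimal sketch of this step, the helper below checks that a token is present before calling piggyback::pb_download(). The helper name download_release_file() is hypothetical and not part of piggyback, and the strict token check assumes you are working with a private repository; public repositories may not need it.

```r
# Hypothetical wrapper (not part of piggyback): refuse to download when no
# GITHUB_TOKEN is set, then delegate to piggyback::pb_download().
download_release_file <- function(file, repo, tag = "latest", dest = ".") {
  if (!nzchar(Sys.getenv("GITHUB_TOKEN"))) {
    stop("GITHUB_TOKEN is not set; see the token setup steps above.")
  }
  piggyback::pb_download(
    file = file,  # name of the file attached to the release
    repo = repo,  # repository in "owner/repo" form
    tag  = tag,   # release tag the file was attached to
    dest = dest   # local directory to download into
  )
}
```

A call would look like download_release_file("results.csv", repo = "owner/repo", tag = "v0.0.1"), where the file name, repository, and tag are placeholders to be replaced with the values for the release you need.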


  • Do not use the .token argument of the piggyback functions to set your token: it is easy to forget it is there and push your code, along with your private GITHUB_TOKEN, to GitHub.

  • Do not write the command that sets your GITHUB_TOKEN directly in Jupyter notebook cells.

  • If you have pushed code that includes your GITHUB_TOKEN to GitHub by mistake, please invalidate the exposed token by accessing this link and clicking Delete.
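
One way to follow these precautions is to keep the token out of notebooks and scripts entirely and load it from a file instead. The sketch below assumes you have added a line of the form GITHUB_TOKEN=yourtoken to a user-level .Renviron file in your home directory; that file location is an assumption, not something this page prescribes.

```r
# Assumed location of the user-level environment file.
renviron_path <- path.expand("~/.Renviron")

# Load its variables into the current session, if the file exists.
if (file.exists(renviron_path)) {
  readRenviron(renviron_path)
}

# Confirm the token is available without printing its value.
token_is_set <- nzchar(Sys.getenv("GITHUB_TOKEN"))
message("GITHUB_TOKEN set: ", token_is_set)
```

Because only the variable name appears in the code, a notebook containing this cell can be committed without exposing the token itself.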