Course content for the R and Big Data class 2019 at UII.
Course: R and Big Data
- course introduction
- basics
- recap tidyverse
- R web scraping
- R web scraping advanced
- introduction to distributed computing (Hadoop & Spark)
- connect R and Spark (quickstart in the cloud)
  - Databricks Community Edition
  - import the sample notebook: https://docs.databricks.com/_static/notebooks/sparklyr.html
- connecting R and Spark
- practical intro to dataframes
- partitioning, UDFs and joins
- extensions (not part of the course, only mentioned briefly)
- machine learning
- streaming
- further (advanced) content if time permits
- capstone project