
Spotify API ETL Job

Run Spotify API ETL Job in 15 minutes


About

This repo covers the entire ETL process for Spotify API data.

We are going to build a simple data pipeline (or in other words, a data feed) that downloads Spotify data on what songs we've listened to in the last 24 hours, and saves that data in a SQLite database.
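The extract-transform-load steps described above can be sketched as follows. This is a minimal sketch, not the repo's actual code: the endpoint URL is Spotify's real recently-played endpoint, but the function names, table name, and database filename are illustrative assumptions.

```python
import sqlite3

SPOTIFY_RECENTLY_PLAYED = "https://api.spotify.com/v1/me/player/recently-played"


def extract(token, limit=50):
    # Hypothetical extract step: call Spotify's recently-played endpoint.
    # Requires the `requests` package and a valid OAuth bearer token.
    import requests
    resp = requests.get(
        SPOTIFY_RECENTLY_PLAYED,
        headers={"Authorization": f"Bearer {token}"},
        params={"limit": limit},
    )
    resp.raise_for_status()
    return resp.json()


def transform(payload):
    # Flatten the API response into (song, artist, played_at) rows.
    return [
        (
            item["track"]["name"],
            item["track"]["artists"][0]["name"],
            item["played_at"],
        )
        for item in payload.get("items", [])
    ]


def load(rows, db_path="my_played_tracks.sqlite"):
    # Append rows to a SQLite table, creating it on the first run.
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS played_tracks ("
        "song_name TEXT, artist_name TEXT, played_at TEXT)"
    )
    conn.executemany("INSERT INTO played_tracks VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()
```

Keeping `transform` a pure function makes the pipeline easy to test without hitting the API.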

We will schedule this pipeline to run daily. After a few months we will end up with our own private Spotify played-tracks history dataset!
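Since each daily run should only pick up plays from the last 24 hours, a small validation step can catch out-of-range timestamps before loading. A sketch, assuming the function name and cutoff logic (not taken from the repo):

```python
from datetime import datetime, timedelta, timezone


def check_last_24_hours(played_at_timestamps, now=None):
    # Return True only if every timestamp falls within the past 24 hours.
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=24)
    for ts in played_at_timestamps:
        # Spotify's played_at uses ISO 8601 with a trailing "Z" for UTC.
        played = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        if not (cutoff <= played <= now):
            return False
    return True
```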

The Demo

Check out the demo (currently run locally; later to be run in Docker):

Installation

  • Clone this repo: Github repo to clone
  • Install and set up Airflow: Airflow Quickstart
    • Change `dags_folder` in `~/airflow/airflow.cfg`, e.g., from `dags_folder = /home/<USER>/airflow/dags` to `dags_folder = /home/<USER>/spotify-ETL-job/dags`
  • Add your Spotify username:
    • Open your Spotify app, click your profile/username, then Account. From there, copy the 10-digit username, and
    • paste it in spotify_etl.py line 44.
  • Add your Spotify API token.
  • Open 2 WSL terminals:
    • On terminal 1, run `airflow webserver --port 8080`
    • On terminal 2, run `airflow scheduler`
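The `airflow.cfg` edit from the steps above can be scripted. The sketch below demonstrates the `sed` edit on a sample file so it is safe to run anywhere; in practice you would point it at `~/airflow/airflow.cfg`, and the repo path is an assumption based on the clone step:

```shell
# Demonstrate the dags_folder edit on a sample airflow.cfg
# (in practice, run the sed command against ~/airflow/airflow.cfg).
cfg=$(mktemp)
printf 'dags_folder = /home/user/airflow/dags\n' > "$cfg"
sed -i "s|^dags_folder = .*|dags_folder = $HOME/spotify-ETL-job/dags|" "$cfg"
cat "$cfg"

# After editing the real file, start Airflow (one command per terminal):
#   airflow webserver --port 8080
#   airflow scheduler
```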
