Data pipeline for a music streaming platform with Apache Airflow

donsolana/sparkify-dag

Latest commit

8dfe01e · Jan 19, 2024

Sparkify-dag

Data pipeline for a music streaming platform with Apache Airflow

Introduction

This project contains an Apache Airflow data pipeline for a music streaming platform. It is an automated version of the Red-Hot project, which lives in another repository on this account.

Structure

1. Dag: Contains the main Directed Acyclic Graph (DAG) definition.

2. Plugins: Contains the custom Airflow operators and the helper files.

I. Helper files

a. Init file: Part of Airflow's standard structure for helper scripts; it makes the helpers importable from the DAG definition.
b. SQL queries: Contains the SQL insert statements for the Redshift warehouse.

II. Operators: Reusable building blocks of the DAG, written as Python classes that inherit from a base Airflow class.
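
To illustrate how the helper queries and a custom operator might fit together, here is a minimal sketch. It is not the repository's code: the names `SqlQueries`, `LoadTableOperator`, `songplays`, `staging_events`, and the `run_sql` parameter are hypothetical, and the fallback `BaseOperator` stub exists only so the example runs without Airflow installed.

```python
# Illustrative sketch only; all class, table, and column names are assumptions.
try:
    from airflow.models.baseoperator import BaseOperator
except ImportError:
    # Minimal stand-in so the sketch can run without Airflow installed.
    class BaseOperator:
        def __init__(self, task_id, **kwargs):
            self.task_id = task_id


class SqlQueries:
    """Hypothetical helper class holding the SELECT part of each insert."""

    songplay_table_insert = "SELECT userid, song, ts FROM staging_events"


class LoadTableOperator(BaseOperator):
    """Hypothetical operator that loads one warehouse table from a SELECT."""

    def __init__(self, table, select_sql, run_sql=print, **kwargs):
        super().__init__(**kwargs)
        self.table = table
        self.select_sql = select_sql
        # Assumption: run_sql stands in for a database hook's "run" call;
        # by default the statement is only printed.
        self.run_sql = run_sql

    def build_sql(self):
        # Combine the target table with the helper's SELECT statement.
        return f"INSERT INTO {self.table} {self.select_sql}"

    def execute(self, context):
        # Airflow calls execute() when the task runs.
        self.run_sql(self.build_sql())


load_songplays = LoadTableOperator(
    task_id="load_songplays",
    table="songplays",
    select_sql=SqlQueries.songplay_table_insert,
)
print(load_songplays.build_sql())
```

In a real deployment, the operator would probably send the statement to Redshift through a connection hook rather than `print`, and the task would be created inside the DAG definition so the scheduler can run it.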
