Data-Warehouse-Project Public
This project builds a modern data warehouse on Google BigQuery using dbt. It transforms raw data from the Wide World Importers sample dataset into a star schema, featuring a central fact table for …
Shell Updated Feb 18, 2025
End-to-end data analytics project with Microsoft Fabric
Jupyter Notebook Updated Feb 15, 2025
cdc-pipeline Public
A Change Data Capture (CDC) pipeline using PostgreSQL, MinIO, ClickHouse, and Airflow. It syncs data from PostgreSQL to ClickHouse via MinIO, supports batch processing, incremental updates, and SCD…
Python Updated Dec 18, 2024
NYC-Taxi-pipeline Public
Builds a data lakehouse with open-source technology. Supports an end-to-end data pipeline, from source data on AWS S3 through the lakehouse to visualization.
Fitness_data_streaming Public
Real-time data processing using a Delta pipeline architecture, with Delta tables stored in a Databricks Lakehouse.
Python Updated Sep 12, 2024
airbyte-haravan-source Public
Forked from airbytehq/airbyte
Data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes.
Python Other Updated Nov 23, 2023