Data pipelines from re-usable components
Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.
The open-source Useful SDK. One Python decorator in the Useful library allows for full observability of Python functions within an ETL (a generic sketch of this decorator pattern appears below).
A project structure for doing and sharing data engineering work.
e-Portfolio showcasing my personal projects.
Build ETL pipelines on Airflow to load data from BigQuery and store it in MySQL (a minimal DAG sketch of this pattern appears below).
AutoDS-Prep automates the data pre-processing step of Data Science Projects.
An extension that registers all pharmacies in Argentina.
A deployed machine learning model that automatically classifies incoming disaster messages into 36 related categories. Developed as part of Udacity's Data Science Nanodegree program.
JSON-driven ETL pipeline framework prototype
This repo contains the DAGs that run on my local Airflow environment. I use the local environment to test my DAGs before deploying them to virtual machines via Kubernetes.
Weaving together different threads (services like image/audio conversion, ETL services, etc.) to enable the World Wide Flow.
A Python and Spark based ETL framework. While it operates within the limits of a framework and its standards, it offers boundless possibilities.
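
The Useful SDK's actual API is not shown here; as a generic illustration of the decorator-based observability pattern mentioned in the Useful SDK entry above, the sketch below wraps an ETL step with timing and logging. The observe decorator, the transform step, and the sample rows are hypothetical placeholders.

import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("etl.observability")

def observe(func):
    # Hypothetical decorator: logs start, duration, output size, and failures of an ETL step.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("step %s started", func.__name__)
        started = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("step %s failed", func.__name__)
            raise
        elapsed = time.perf_counter() - started
        size = len(result) if hasattr(result, "__len__") else "n/a"
        logger.info("step %s finished in %.2fs (output size: %s)", func.__name__, elapsed, size)
        return result
    return wrapper

@observe
def transform(rows):
    # Placeholder transform: uppercase a text field in each row.
    return [{**row, "name": row["name"].upper()} for row in rows]

if __name__ == "__main__":
    transform([{"name": "alice"}, {"name": "bob"}])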
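
For the Airflow, BigQuery, and MySQL item above, the following is one possible minimal sketch of such a DAG, assuming Airflow 2.x with the Google Cloud and MySQL provider packages installed; the query, table names, and connection ID are placeholders, not taken from that repository.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.mysql.hooks.mysql import MySqlHook
from google.cloud import bigquery


def bigquery_to_mysql():
    # Extract: run a query against BigQuery using application-default credentials.
    bq_client = bigquery.Client()
    rows = bq_client.query(
        "SELECT id, name, created_at FROM `my_project.my_dataset.events`"
    ).result()

    # Load: insert the rows into MySQL through an Airflow connection.
    mysql = MySqlHook(mysql_conn_id="mysql_default")
    mysql.insert_rows(
        table="etl_events",
        rows=[(r.id, r.name, r.created_at) for r in rows],
        target_fields=["id", "name", "created_at"],
    )


with DAG(
    dag_id="bigquery_to_mysql_example",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_task = PythonOperator(task_id="bigquery_to_mysql", python_callable=bigquery_to_mysql)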