ETL and API integration
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.
A lightweight, opinionated ETL framework, halfway between plain scripts and Apache Airflow
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
An orchestration platform for the development, production, and observation of data assets.
A curated list of awesome ETL frameworks, libraries, and software.
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
A TypeScript-based enterprise service bus framework built on enterprise integration patterns
Stupidly simple Python package to build front-end dashboards within Python
PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis
Sync data between persistence engines, like ETL only not stodgy
YTsaurus is a scalable and fault-tolerant open-source big data platform.
ReplicaDB is an open-source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases.
Cadence is a distributed, scalable, durable, and highly available orchestration engine to execute asynchronous long-running business logic in a scalable and resilient way.
C7 CE enters EOL in October 2025. Please check out C8 https://github.com/camunda/camunda – Flexible framework for workflow and decision automation with BPMN and DMN. Integration with Quarkus, Sprin…
SQLpipe makes it easy to move the result of one query from one database to another.
Cenit IO | Docs to help you use our platform and build integrations easily and well
Reshuffle is a lightweight, open-source integration and workflow framework for Node.js.
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 16…
GridDB is a next-generation open-source database that makes time series IoT and big data fast and easy.
Awesome list of open-source startup alternatives to well-known SaaS products 🚀