Stars
A flexible distributed key-value datastore that supports both caching and beyond-caching workloads.
"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, OneDrive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
The simplest, fastest way to get business intelligence and analytics to everyone in your company 😋
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.
The Data Contract Specification Repository
Pulumi - Infrastructure as Code in any programming language 🚀
🦜🔗 Build context-aware reasoning applications
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streami…
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
🐶 Kubernetes CLI To Manage Your Clusters In Style!
ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
👾 Scripts and samples to support Confluent Demos and Talks.
Hopsworks - Data-Intensive AI platform with a Feature Store
Open Source Feature Flagging and A/B Testing Platform
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
Always know what to expect from your data.
The user-friendly command line shell.
Modin: Scale your Pandas workflows by changing a single line of code
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Code to illustrate a few examples of neural networks (NN) created from scratch
A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php.
Python library for creating data pipelines with chained functional programming