- Budapest
- @treff7es
Stars
The Metadata Platform for your Data and AI Stack
data load tool (dlt) is an open-source Python library that makes data loading easy 🛠️
🎨 Diagram as Code for prototyping cloud system architectures
Data Apps & Dashboards for Python. No JavaScript Required.
Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.
📘 dict subclass with keylist/keypath support, built-in I/O operations (base64, csv, html, ini, json, pickle, plist, query-string, toml, xls, xml, yaml), S3 support and many utilities.
Schema modelling framework for decentralised domain-driven ownership of data.
A collection of JSON schema files including full API
PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena.
A non-validating SQL parser module for Python
Uses tokenized query returned by python-sqlparse and generates query metadata
SQL Lineage Analysis Tool powered by Python
PostgreSQL Languages AST and statements prettifier: master branch covers PG10, v2 branch covers PG12, v3 covers PG13, v4 covers PG14, v5 covers PG15, v6 covers PG16, v7 covers PG17
ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an organization to collaborate on effective identification and an…
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
A curated list of Dagster code snippets for data engineers
Apache Pinot - A realtime distributed OLAP datastore
🐶 Kubernetes CLI To Manage Your Clusters In Style!
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
A load balancer / proxy / gateway for prestodb
Cruise Control Frontend (CCFE): Single-Page Web Application to Manage Kafka Clusters at Large Scale
An orchestration platform for the development, production, and observation of data assets.
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
Airflow training for the Crunch conference
A collection of tools for working with Apache Kafka.
Apache Superset is a Data Visualization and Data Exploration Platform