Stars
🔥LeetCode solutions in any programming language | Solutions to LeetCode, 剑指 Offer (Coding Interviews, 2nd edition), and Cracking the Coding Interview (6th edition) in multiple programming languages
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
An enterprise-class UI design language and React UI library
A curated awesome list of lists of interview questions. Feel free to contribute! 🎓
Dynamically generate Apache Airflow DAGs from YAML configuration files
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
A collection of Airflow operators, hooks, and utilities to elevate dbt to a first-class citizen of Airflow.
Learn how to design systems at scale and prepare for system design interviews
This is a repo with links to everything you'd ever want to learn about data engineering
Yet another directed acyclic graph (DAG) implementation in golang.
A comprehensive guide to implementing real-time change data capture from MySQL using Debezium, Confluent Schema Registry, Spark Structured Streaming, and Apache Iceberg
Extensible streaming ingestion pipeline on top of Apache Spark
Visualize column-level data lineage in Spark SQL
Technical interview questions for backend engineers.
This repo is used to demonstrate the features that Volcano can bring to the Spark operator
This project provides a reverse proxy for Spark UI on Kubernetes
A UI for managing Spark applications with the Spark Operator
This is a repository containing the list of company-wise questions available on LeetCode premium. Every PDF file in this repository corresponds to a list of questions on LeetCode for a specific com…
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Open, Multi-modal Catalog for Data & AI
Cuckoo Filter: Practically Better Than Bloom
Blazing-fast query execution engine that speaks the Apache Spark language and has Arrow-DataFusion at its core.
Chat and Ask on your own data. Accelerator to quickly upload your own enterprise data and use OpenAI services to chat to that uploaded data and ask questions