Stars
High-Performance Python Compute Engine for Data and AI
aider is AI pair programming in your terminal
BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper)
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, …
Open, Multi-modal Catalog for Data & AI
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 14+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
Foundational Models for State-of-the-Art Speech and Text Translation
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Bokeh Plotting Backend for Pandas and GeoPandas
Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Performance-portable, length-agnostic SIMD with runtime dispatch
A collection of libraries to optimise AI model performance
REST Book is a Visual Studio Code extension that allows you to perform REST calls in a Notebook interface.
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Distributed SQL Query Engine in Python using Ray
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
A modular acceleration toolkit for big data analytic engines
A latent text-to-image diffusion model
A list of learning materials to understand database internals
Warp is a modern, Rust-based terminal with AI built in so you and your team can build great software, faster.
Define and run multi-container applications with Docker
Uniffle is a high performance, general purpose Remote Shuffle Service.
Zenith - sort of like top or htop but with zoom-able charts, CPU, GPU, network, and disk usage
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Chinese translation of "Designing Data-Intensive Applications" (DDIA)