- Adobe
- San Jose, CA
Stars
🦜🔗 Build context-aware reasoning applications
The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 14+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
A library that provides an embeddable, persistent key-value store for fast storage.
Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Diem’s mission is to build a trusted and innovative financial network that empowers people and businesses around the world.
A time-series database for high-performance real-time analytics packaged as a Postgres extension
The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many mo…
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.
Go implementation of the Ethereum protocol
ClickHouse® is a real-time analytics database management system
😎 A curated list of amazingly awesome Flink and Flink ecosystem resources
Spark Gotchas. A subjective compilation of Apache Spark tips and tricks
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
Apache Spark - A unified analytics engine for large-scale data processing
Akka Streams & Akka HTTP for Large-Scale Production Deployments
Examples of Data Science Tools & Libraries
wangmiao1981 / realdeal
Forked from pitzer/realdeal
A data pipeline to scrape and process data from real estate websites.