View wangmiao1981's full-sized avatar


🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 101,873 16,519 Updated Feb 28, 2025

The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

Clojure 41,021 5,396 Updated Mar 1, 2025

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 14+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

Python 7,361 579 Updated Mar 1, 2025

A library that provides an embeddable, persistent key-value store for fast storage.

C++ 29,208 6,420 Updated Feb 27, 2025

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!

C++ 10,051 604 Updated Mar 1, 2025

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Java 10,915 3,113 Updated Feb 28, 2025

Apache Iceberg

Java 6,957 2,394 Updated Feb 28, 2025

Convolutional Neural Networks

C 26,038 21,328 Updated May 3, 2024

YAD2K: Yet Another Darknet 2 Keras

Python 2,720 879 Updated Dec 16, 2020

Diem’s mission is to build a trusted and innovative financial network that empowers people and businesses around the world.

Rust 16,698 2,590 Updated Feb 25, 2025

A time-series database for high-performance real-time analytics packaged as a Postgres extension

C 18,466 915 Updated Feb 28, 2025

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many mo…

TypeScript 66,719 12,429 Updated Mar 1, 2025

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala 7,835 1,778 Updated Feb 28, 2025

Repo for counting stars and contributing. Press F to pay respect to glorious developers.

270,329 21,126 Updated Oct 3, 2024

Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.

Go 2,869 1,400 Updated Mar 1, 2025

Go implementation of the Ethereum protocol

Go 48,480 20,625 Updated Feb 28, 2025

ClickHouse® is a real-time analytics database management system

C++ 39,255 7,136 Updated Mar 1, 2025

😎 A curated list of amazingly awesome Flink and Flink ecosystem resources

776 113 Updated Jun 8, 2023

A collection of excellent resources, tools, and frameworks used on the path to becoming a professional programmer

9,696 2,120 Updated Feb 21, 2023

JVM related exercises

Scala 11 6 Updated Jul 16, 2017

Apache Flink

Java 24,568 13,520 Updated Feb 28, 2025

Deep Learning for humans

Python 62,634 19,526 Updated Feb 28, 2025

Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks

363 80 Updated Jun 6, 2017

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …

C++ 16,985 3,863 Updated Feb 28, 2025

Apache Spark - A unified analytics engine for large-scale data processing

Scala 40,632 28,521 Updated Mar 1, 2025

Akka Streams & Akka HTTP for Large-Scale Production Deployments

Scala 1,434 236 Updated Apr 17, 2024

Spark Multiuser Benchmark

C 5 20 Updated Jun 8, 2017

Examples of Data Science Tools & Libraries

1 Updated Apr 23, 2016

A data pipeline to scrape and process data from real estate websites.

HTML 1 Updated Sep 12, 2016