Northwestern Polytechnical University
Xi'an, Shaanxi, China
Starred repositories
A high-throughput and memory-efficient inference and serving engine for LLMs
Fast OS-level support for GPU checkpoint and restore
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
A distributed KV store for disaggregated LLM inference
Efficient and easy multi-instance LLM serving
NVIDIA Linux open GPU kernel module source
Justitia provides RDMA isolation between applications with diverse requirements.
High-performance Transformer implementation in C++.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A curated list of resources on event-driven architecture.
A collection of awesome researchers and papers about disaggregated memory.
This project aims to share the technical principles behind large language models along with hands-on experience (LLM engineering and real-world LLM application deployment).
Heterogeneous AI Computing Virtualization Middleware
Survey: a collection of awesome papers and resources on large language model (LLM)-related recommender system topics.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Artifact evaluation repo for EuroSys'24.
Sharing the codebase and steps for artifact evaluation for the ISCA 2023 paper
A List of Recommender Systems and Resources
User documentation for Knative components.
Underlay and RDMA network solution for Kubernetes, covering bare metal, VMs, and any public cloud
rFaaS: a high-performance FaaS platform with RDMA acceleration for low-latency invocations.