LiuMicheal
  • Northwestern Polytechnical University
  • Xi'an, Shaanxi, China

Starred repositories

RDMA CNI plugin for containerized workloads

Go 52 23 Updated May 13, 2025

Globally Addressable Memory management (efficient distributed memory management via RDMA and caching)

C++ 101 46 Updated Sep 2, 2023

Arbitrary offloads for RDMA NICs

C 91 20 Updated Apr 25, 2022

Translation project for A Primer on Memory Consistency and Cache Coherence (Second Edition)

246 45 Updated May 5, 2024

DeepSeek Coder: Let the Code Write Itself

Python 21,558 2,464 Updated May 21, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 47,888 7,559 Updated May 23, 2025

Fast OS-level support for GPU checkpoint and restore

C++ 193 19 Updated May 21, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,241 256 Updated May 23, 2025

LLM inference in C/C++

C++ 80,721 11,874 Updated May 23, 2025

KV cache store for distributed LLM inference

C++ 237 20 Updated May 16, 2025

Efficient and easy multi-instance LLM serving

Python 414 31 Updated May 23, 2025

NVIDIA Linux open GPU kernel module source

C 15,807 1,402 Updated May 19, 2025

Public repo for farreach

C++ 16 Updated Feb 1, 2025

Justitia provides RDMA isolation between applications with diverse requirements.

C 40 7 Updated May 25, 2022

High performance Transformer implementation in C++.

C++ 122 16 Updated Jan 18, 2025

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 2,010 183 Updated Mar 26, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,313 259 Updated May 22, 2025

A curated list of resources on event-driven architecture.

456 17 Updated Dec 13, 2024

A collection of awesome researchers and papers about disaggregated memory.

154 12 Updated Apr 18, 2025

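This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).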

HTML 17,802 2,097 Updated May 1, 2025

Heterogeneous AI Computing Virtualization Middleware

Go 1,633 298 Updated May 23, 2025

Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.

1,304 75 Updated May 22, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 37,158 6,304 Updated May 23, 2025

Artifact evaluation repo for EuroSys'24.

Python 26 2 Updated Nov 7, 2023

Sharing the codebase and steps for artifact evaluation for ISCA 2023 paper

Python 14 5 Updated Feb 20, 2024

A List of Recommender Systems and Resources

4,706 702 Updated Feb 25, 2025

The first book on GlusterFS published in China, intended to help everyone learn about and understand it.

562 93 Updated Aug 22, 2022