  • Northwestern Polytechnical University
  • Xi'an, Shaanxi, China

Starred repositories

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 33,750 5,160 Updated Jan 15, 2025

Fast OS-level support for GPU checkpoint and restore

C 133 11 Updated Jan 7, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,758 220 Updated Jan 14, 2025

LLM inference in C/C++

C++ 70,704 10,217 Updated Jan 15, 2025

A distributed KV store for disaggregated LLM inference

C++ 22 2 Updated Jan 14, 2025

Efficient and easy multi-instance LLM serving

Python 276 17 Updated Jan 14, 2025

NVIDIA Linux open GPU kernel module source

C 15,419 1,320 Updated Dec 17, 2024

Public repo for farreach

C++ 15 Updated May 5, 2024

Justitia provides RDMA isolation between applications with diverse requirements.

C 39 7 Updated May 25, 2022

High performance Transformer implementation in C++.

C++ 98 14 Updated Jan 15, 2025

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,941 176 Updated Nov 20, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,366 134 Updated Jan 14, 2025

A curated list of resources on event-driven architecture.

420 16 Updated Dec 13, 2024

A collection of awesome researchers and papers about disaggregated memory.

136 12 Updated Dec 30, 2024

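This project shares the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).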

HTML 12,864 1,422 Updated Jan 14, 2025

Heterogeneous AI Computing Virtualization Middleware

Go 1,198 244 Updated Jan 14, 2025

Survey: A collection of AWESOME papers and resources on the large language model (LLM) related recommender system topics.

1,143 66 Updated Jan 13, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 34,882 5,931 Updated Jan 14, 2025

Artifact evaluation repo for EuroSys'24.

Python 25 2 Updated Nov 7, 2023

Sharing the codebase and steps for artifact evaluation for ISCA 2023 paper

Python 15 5 Updated Feb 20, 2024

A List of Recommender Systems and Resources

4,631 692 Updated Jun 12, 2024

The first book on GlusterFS published in China, intended to help readers learn about and understand it.

553 90 Updated Aug 22, 2022

User documentation for Knative components.

JavaScript 4,602 1,237 Updated Jan 14, 2025

Source code of Fuyao, built on Nightcore

C++ 7 6 Updated Mar 8, 2024

document

188 88 Updated Sep 1, 2015

Easy to use RDMA API in Rust async

Rust 388 47 Updated Nov 21, 2023

Underlay and RDMA network solution for Kubernetes, supporting bare metal, VMs, and any public cloud

Go 547 82 Updated Jan 13, 2025

rFaaS: a high-performance FaaS platform with RDMA acceleration for low-latency invocations.

C++ 49 15 Updated Jan 14, 2025
C 6 2 Updated Nov 4, 2024