unix1986

Starred repositories

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,552 946 Updated Apr 15, 2025

LeaderWorkerSet: An API for deploying a group of pods as a unit of replication

Go 397 69 Updated Apr 15, 2025

An Open-source RL System from ByteDance Seed and Tsinghua AIR

1,109 47 Updated Apr 10, 2025

The official Python SDK for Model Context Protocol servers and clients

Python 9,338 946 Updated Apr 15, 2025

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,086 72 Updated Apr 14, 2025

A C++11 library for serialization

C++ 4,391 792 Updated Jan 20, 2025

A C++ header-only HTTP/HTTPS server and client library

C++ 14,108 2,413 Updated Apr 8, 2025

NCCL Tests

Cuda 1,063 272 Updated Mar 15, 2025

An extremely fast Python package and project manager, written in Rust.

Rust 49,731 1,385 Updated Apr 15, 2025

A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute

Python 625 84 Updated Oct 24, 2023

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,556 263 Updated Apr 14, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 3,123 245 Updated Apr 15, 2025

ZeroMQ core engine in C++, implements ZMTP/3.1

C++ 10,102 2,395 Updated Dec 30, 2024

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 698 56 Updated Jan 21, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3.

Python 1,170 129 Updated Apr 14, 2025

Fast, Flexible and Portable Structured Generation

C++ 875 66 Updated Apr 10, 2025

Official electron build of draw.io

JavaScript 53,883 5,233 Updated Apr 1, 2025

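AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks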

Jupyter Notebook 13,208 1,899 Updated Apr 10, 2025

Mirror clone of https://gitee.com/gsls200808/chinese-opensource-mirror-site as the README.md on that repository has been filtered.

302 29 Updated May 16, 2021

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 885 104 Updated Apr 14, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,040 204 Updated Apr 15, 2025

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 10,695 1,590 Updated Jan 13, 2025

Efficient Top-K implementation on the GPU

Cuda 175 21 Updated Apr 9, 2019

Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali

Python 2,060 137 Updated Mar 28, 2025

A blazing fast inference solution for text embeddings models

Rust 3,426 242 Updated Apr 15, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 2,922 243 Updated Apr 14, 2025

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 39,340 4,975 Updated Aug 16, 2024

The best OSS video generation models

Python 3,095 334 Updated Jan 8, 2025

Open MPI main development repository

C 2,324 893 Updated Apr 15, 2025

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,452 650 Updated Feb 10, 2025