Starred repositories
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
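The "fine-grained scaling" in DeepGEMM's description refers to keeping one scale factor per small block of a matrix rather than one per tensor, so each scale only has to cover a narrow dynamic range. A minimal NumPy sketch of the idea (a crude simulation, not DeepGEMM's actual FP8 kernels; the block size and rounding grid are illustrative assumptions):

```python
import numpy as np

def quantize_blockwise(a, block=128):
    """Simulate FP8-style quantization with one scale per `block` columns.
    Small blocks mean each scale covers less dynamic range, so quantization
    error stays low. The 1/8-step rounding grid is a crude stand-in for an
    e4m3 mantissa, not a faithful FP8 encoding."""
    fp8_max = 448.0  # largest finite e4m3 value
    rows, cols = a.shape
    pad = (-cols) % block
    ap = np.pad(a, ((0, 0), (0, pad))).reshape(rows, -1, block)
    # one scale per (row, block): map the block's max magnitude to fp8_max
    scales = np.abs(ap).max(axis=2, keepdims=True) / fp8_max
    scales = np.where(scales == 0, 1.0, scales)
    q = np.clip(np.round(ap / scales * 8) / 8, -fp8_max, fp8_max)
    deq = (q * scales).reshape(rows, -1)[:, :cols]
    return deq, scales.squeeze(2)
```

Dequantizing recovers the input to within a small fraction of each block's maximum, which is the property the fine-grained scheme buys over a single per-tensor scale.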
DeepEP: an efficient expert-parallel communication library
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
MoBA: Mixture of Block Attention for Long-Context LLMs
NVIDIA Linux open GPU with P2P support
Doing simple retrieval from LLMs at various context lengths to measure accuracy
Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
How to optimize some algorithms in CUDA.
Large Language Model Text Generation Inference
A high-throughput and memory-efficient inference and serving engine for LLMs
Transformer-related optimizations, including BERT and GPT
Grasper: A High Performance Distributed System for OLAP on Property Graphs.
A solver for subgraph isomorphism problems, based upon a series of papers by subsets of McCreesh, Prosser, and Trimble.
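For reference, the problem such a solver tackles can be stated as a naive brute-force check (my own illustrative sketch, exponential and only usable on tiny graphs; real solvers like the one described above prune the search aggressively):

```python
from itertools import permutations

def subgraph_isomorphic(pattern, target):
    """Naive non-induced subgraph isomorphism: does `target` contain a copy
    of `pattern`? Graphs are dicts mapping a vertex to its neighbour set.
    Tries every injective mapping of pattern vertices into target vertices
    and checks that every pattern edge is preserved."""
    pv = list(pattern)
    for chosen in permutations(list(target), len(pv)):
        mapping = dict(zip(pv, chosen))
        if all(mapping[v] in target[mapping[u]]
               for u in pattern for v in pattern[u]):
            return True
    return False
```

For example, a triangle pattern is found in K4 but not in a 4-vertex path.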
CP 2015 subgraph isomorphism experiments, data and paper
Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.