ChengZhang-98
12 stars written in Python

Fully open reproduction of DeepSeek-R1

Python 18,983 1,604 Updated Feb 11, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,320 889 Updated Feb 12, 2025

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 5,777 527 Updated Dec 14, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,633 495 Updated Feb 8, 2025

Efficient Triton Kernels for LLM Training

Python 4,396 263 Updated Feb 11, 2025

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 701 56 Updated Sep 4, 2024

Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.

Python 339 30 Updated Nov 26, 2024

Kernel Tuner

Python 307 52 Updated Feb 11, 2025

PyTorch emulation library for Microscaling (MX)-compatible data formats

Python 196 30 Updated Sep 23, 2024

Machine-Learning Accelerator System Exploration Tools

Python 144 69 Updated Feb 11, 2025

Long Range Arena for Benchmarking Efficient Transformers

Python 5 Updated Sep 28, 2022