- Imperial College London
- London
- https://chengzhang-98.github.io/blog/
- in/chengzhang98
Stars
Fully open reproduction of DeepSeek-R1
SGLang is a fast serving framework for large language models and vision language models.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Efficient Triton Kernels for LLM Training
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.
PyTorch emulation library for Microscaling (MX)-compatible data formats
Machine-Learning Accelerator System Exploration Tools
Long Range Arena for Benchmarking Efficient Transformers