- Imperial College London
- London
- https://chengzhang-98.github.io/blog/
- in/chengzhang98
Stars
Fully open reproduction of DeepSeek-R1
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
This repository contains demos made with the Transformers library by Hugging Face.
SGLang is a fast serving framework for large language models and vision language models.
Implementation of Microscaling data formats in SystemVerilog.
tpu-systolic-array-weight-stationary
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama 3, Mistral, InternLM2, GPT-4, LLaMA 2, Qwen, GLM, Claude, etc.) across 100+ datasets.
All-in-one VS Code plugin for HDL development
Verilog AXI components for FPGA implementation
Machine-Learning Accelerator System Exploration Tools
PyTorch emulation library for Microscaling (MX)-compatible data formats
The resources I use to learn computer science in my spare time
A modern, batteries-included Neovim configuration for Python, Lua, C++, Markdown, LaTeX, and more
A third-party NetEase Cloud Music player with a polished UI, supporting Windows / macOS / Linux
Long Range Arena for Benchmarking Efficient Transformers
MLNLP: A collection of papers from top AI conferences (e.g. ACL, EMNLP, NAACL, COLING, AAAI, IJCAI, ICLR, NeurIPS, and ICML) with open-source code
This repository contains source code to binarize any real-valued word embeddings into binary vectors.