University of Washington - Bellevue
https://orcid.org/0009-0007-8680-7030
Starred repositories
PerFlow-AI is a programmable performance analysis, modeling, and prediction tool for AI systems.
The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference
My learning notes and code for ML systems (MLSys).
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, FlashAttention, PagedAttention, MLA, Parallelism, etc. 🎉🎉
A simple, performant, and scalable JAX LLM!
Large Language Model (LLM) Systems Paper List
A PyTorch native library for large model training
Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
verl: Volcano Engine Reinforcement Learning for LLMs
📰 Must-read papers and blogs on Speculative Decoding ⚡️
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more (see the minimal JAX sketch after this list)
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Hackable and optimized Transformers building blocks, supporting a composable construction.
2025 AI/ML internship & new graduate job list updated daily
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Open CS Application | 开源CS申请 (open-source CS applications)
FlashInfer: Kernel Library for LLM Serving
A throughput-oriented high-performance serving framework for LLMs
A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
Disaggregated serving system for Large Language Models (LLMs).
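
The JAX entry above advertises three composable transformations: differentiate, vectorize, and JIT-compile. Below is a minimal sketch of how they compose, using only the public jax.grad / jax.vmap / jax.jit API; the loss function is a made-up example, not code from any listed repository.

```python
# Minimal sketch of the three core JAX transformations named above.
# Assumes `jax` is installed; `loss` is a hypothetical example function.
import jax
import jax.numpy as jnp

def loss(w, x):
    # Squared error of a linear model y = w * x against a target of 1.0.
    return (w * x - 1.0) ** 2

# differentiate: gradient of the loss with respect to w (the first argument)
grad_loss = jax.grad(loss)

# vectorize: map the gradient over a batch of inputs x without a Python loop
batched_grad = jax.vmap(grad_loss, in_axes=(None, 0))

# JIT to GPU/TPU (or CPU): compile the batched gradient with XLA
fast_batched_grad = jax.jit(batched_grad)

x = jnp.linspace(0.0, 1.0, 4)
print(fast_batched_grad(2.0, x))  # one gradient per batch element
```

The same pattern, grad inside vmap inside jit, is the foundation the JAX LLM entry above builds on at scale.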