[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook · 7,117 stars · 451 forks · Updated Mar 22, 2025

bytedance / ShadowKV

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python · 153 stars · 7 forks · Updated Oct 30, 2024

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 5,752 stars · 571 forks · Updated Mar 27, 2025

hemingkx / SpeculativeDecodingPapers

📰 Must-read papers and blogs on Speculative Decoding ⚡️

661 stars · 33 forks · Updated Mar 27, 2025

jax-ml / jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python · 31,762 stars · 2,964 forks · Updated Mar 27, 2025

pytorch-labs / gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python · 5,898 stars · 546 forks · Updated Mar 13, 2025

facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python · 9,235 stars · 652 forks · Updated Mar 25, 2025

speedyapply / 2025-AI-College-Jobs

2025 AI/ML internship & new graduate job list updated daily

955 stars · 27 forks · Updated Mar 27, 2025

usyd-fsalab / fp6_llm

Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5).

CUDA · 242 stars · 17 forks · Updated Oct 28, 2024

AlibabaResearch / flash-llm

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

CUDA · 202 stars · 16 forks · Updated Sep 24, 2023

opencsapp / opencsapp.github.io

Open CS Application | 开源CS申请 (Open-Source CS Applications)

JavaScript · 2,106 stars · 244 forks · Updated Feb 23, 2025

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

CUDA · 2,502 stars · 259 forks · Updated Mar 27, 2025

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

CUDA · 782 stars · 31 forks · Updated Sep 21, 2024

Starred topics

machine-learning-interview-questions

Mingheng Wu wmhst7

Lists (5)

C++

LLM

MLSys

🚀 My stack

Robotics
