- UC Berkeley
- San Francisco Bay Area
- (UTC -08:00)
- https://zhuohan.li
- @zhuohan123
- in/zhuohan-li
Stars
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
A throughput-oriented high-performance serving framework for LLMs
Dynamic Memory Management for Serving LLMs without PagedAttention
A framework for few-shot evaluation of language models.
A fast communication-overlapping library for tensor parallelism on GPUs.
HabanaAI / vllm-fork
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs
A visual no-code/code-free web crawler/spider (易采集): visual browser automation testing, data collection, and crawler software that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.
A ChatGPT (GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
Arena-Hard-Auto: An automatic LLM benchmark.
DSPy: The framework for programming—not prompting—language models
A parallel framework for training deep neural networks
[ICML 2024] CLLMs: Consistency Large Language Models
Universal LLM Deployment Engine with ML Compilation
Standardized Serverless ML Inference Platform on Kubernetes
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
CUDA Python: Performance meets Productivity
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization.
Large World Model -- Modeling Text and Video with Millions Context
Building a quick conversation-based search demo with Lepton AI.
LlamaIndex is a data framework for your LLM applications