
SGLang is a fast serving framework for large language models and vision language models.

Python 12,000 1,266 Updated Mar 16, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 8,875 911 Updated Mar 13, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,732 619 Updated Mar 7, 2025

llm-export can export LLM models to ONNX.

Python 271 30 Updated Jan 17, 2025

Export LLaMA to ONNX

Python 115 14 Updated Dec 28, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,851 508 Updated Mar 16, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,335 2,735 Updated Mar 16, 2025

A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…

Python 787 58 Updated Mar 13, 2025

Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…

Python 1,221 184 Updated Mar 3, 2025

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Python 362 33 Updated Feb 24, 2024

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 298 26 Updated Jul 2, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 433 50 Updated Mar 13, 2025

A primitive library for neural networks

C++ 1,322 218 Updated Nov 24, 2024

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 279 27 Updated Jan 19, 2025

[IEEE T-PAMI 2023] Awesome BEV perception research and cookbook for audiences of all levels in autonomous driving

Python 1,252 109 Updated Jan 6, 2024
Python 78 5 Updated Dec 15, 2023

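《代码随想录》 LeetCode practice guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer bewildering! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀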

Shell 54,728 11,886 Updated Mar 14, 2025

Introduction to large models

18 5 Updated Mar 16, 2024

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 2,010 254 Updated Mar 6, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,274 380 Updated Mar 15, 2025

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,705 361 Updated Mar 16, 2025

✨✨Latest Advances on Multimodal Large Language Models

14,283 918 Updated Mar 14, 2025

Open-Sora: Democratizing Efficient Video Production for All

Python 25,148 2,431 Updated Mar 15, 2025
Python 1,299 186 Updated Mar 14, 2025

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Python 346 24 Updated Mar 21, 2024

A PyTorch quantization backend for Optimum

Python 900 70 Updated Mar 6, 2025

Public repo for HF blog posts

Jupyter Notebook 2,778 840 Updated Mar 15, 2025

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 7,427 815 Updated Mar 14, 2025