GuangyanZhang

LMM GuangyanZhang

8 followers · 12 following

Stars

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 12,000 1,266 Updated Mar 16, 2025

deepseek-ai / DeepSeek-V3

Python 92,293 14,979 Updated Mar 16, 2025

miseon119 / onnx-graphsurgeon-notes

6 Updated Jun 27, 2022

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 8,875 911 Updated Mar 13, 2025

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,732 619 Updated Mar 7, 2025

wangzhaode / llm-export

llm-export can export llm model to onnx.

Python 271 30 Updated Jan 17, 2025

luchangli03 / export_llama_to_onnx

export llama to onnx

Python 115 14 Updated Dec 28, 2024

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,851 508 Updated Mar 16, 2025

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,335 2,735 Updated Mar 16, 2025

NVIDIA / TensorRT-Model-Optimizer

A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…

Python 787 58 Updated Mar 13, 2025

Vahe1994 / AQLM

Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…

Python 1,221 184 Updated Mar 3, 2025

Cornell-RelaxML / QuIP

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Python 362 33 Updated Feb 24, 2024

efeslab / Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 298 26 Updated Jul 2, 2024

ModelTC / llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 433 50 Updated Mar 13, 2025

OpenPPL / ppl.nn

A primitive library for neural network

C++ 1,322 218 Updated Nov 24, 2024

jy-yuan / KIVI

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 279 27 Updated Jan 19, 2025

OpenDriveLab / Birds-eye-view-Perception

[IEEE T-PAMI 2023] Awesome BEV perception research and cookbook for audiences of all levels in autonomous driving

Python 1,252 109 Updated Jan 6, 2024

Ascend / AscendSpeed

Python 78 5 Updated Dec 15, 2023

youngyangyang04 / leetcode-master

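《代码随想录》 (Code Thoughts) LeetCode practice guide: a recommended order for working through 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀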

Shell 54,728 11,886 Updated Mar 14, 2025

csitfun / llm

Introduction to large language models

18 5 Updated Mar 16, 2024

casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 2,010 254 Updated Mar 6, 2025

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,274 380 Updated Mar 15, 2025

pytorch / TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python 2,705 361 Updated Mar 16, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

14,283 918 Updated Mar 14, 2025

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 25,148 2,431 Updated Mar 15, 2025

databricks / megablocks

Python 1,299 186 Updated Mar 14, 2025

Xiuyu-Li / q-diffusion

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Python 346 24 Updated Mar 21, 2024

huggingface / optimum-quanto

A pytorch quantization backend for optimum

Python 900 70 Updated Mar 6, 2025

huggingface / blog

Public repo for HF blog posts

Jupyter Notebook 2,778 840 Updated Mar 15, 2025

Oneflow-Inc / oneflow

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 7,427 815 Updated Mar 14, 2025
