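This project aims to share the technical principles behind large models and hands-on experience with them (large-model engineering and deploying large-model applications).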

HTML 13,942 1,592 Updated Jan 15, 2025

Implementation of popular deep learning networks with TensorRT network definition API

C++ 7,168 1,797 Updated Dec 6, 2024

Simple samples for TensorRT programming

Python 1,570 344 Updated Dec 18, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,156 2,152 Updated Feb 1, 2025

Simple, safe way to store and distribute tensors

Python 3,087 218 Updated Feb 6, 2025

Low-bit LLM inference on CPU with lookup table

C++ 667 50 Updated Jan 9, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,208 879 Updated Feb 11, 2025

Fast and memory-efficient exact attention

Python 15,395 1,450 Updated Feb 10, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,366 1,096 Updated Feb 11, 2025

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 621 55 Updated Jan 21, 2025

State-of-the-art 2D and 3D Face Analysis Project

Python 24,218 5,486 Updated Dec 5, 2024

Tensor library for machine learning

C++ 11,807 1,115 Updated Feb 9, 2025

Qwen (通义千问) vLLM inference deployment demo

Python 507 74 Updated Mar 28, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,486 147 Updated Feb 8, 2025

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,805 118 Updated Jan 13, 2025

[EMNLP'23, ACL'24] To speed up LLM inference and enhance the LLM's perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

Python 4,852 271 Updated Jan 26, 2025

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

C++ 2,964 334 Updated Jul 31, 2024

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Python 4,846 481 Updated Aug 6, 2024

DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.

C 224 24 Updated Feb 11, 2025

Seamless operability between C++11 and Python

C++ 16,145 2,129 Updated Feb 8, 2025

A template matching library based on OpenCV, supporting rotation matching, cross-platform use, and both C++ and Python.

C++ 34 9 Updated May 19, 2024

Open standard for machine learning interoperability

Python 18,392 3,711 Updated Feb 9, 2025

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,323 462 Updated Feb 11, 2025

Serve, optimize and scale PyTorch models in production

Java 4,289 871 Updated Feb 6, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 37,245 5,605 Updated Feb 11, 2025

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 138,983 27,889 Updated Feb 10, 2025

Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, 讯飞星火, 文心一言 and more, discover the best answers

JavaScript 15,525 1,657 Updated Feb 6, 2025

LLM inference in C/C++

C++ 73,826 10,648 Updated Feb 10, 2025