7 stars written in C++

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ · 9,107 stars · 1,050 forks · Updated Jan 8, 2025

Transformer-related optimization, including BERT and GPT

C++ · 5,973 stars · 895 forks · Updated Mar 27, 2024

🛠 A lite C++ toolkit of 100+ awesome AI models; supports ORT, MNN, NCNN, TNN and TensorRT. 🎉🎉

C++ · 3,715 stars · 706 forks · Updated Dec 26, 2024

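A pure C++, cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.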

C++ · 3,364 stars · 348 forks · Updated Dec 23, 2024

Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL

C++ · 541 stars · 85 forks · Updated Nov 18, 2020

A simple Transformer model implemented in C++. Attention Is All You Need.

C++ · 41 stars · 8 forks · Updated Mar 11, 2021
C++ 2 Updated Apr 23, 2020