Stars
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and LLM application deployment).
Implementation of popular deep learning networks with TensorRT network definition API
Simple samples for TensorRT programming
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
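The three TensorRT entries above all center on the network definition API. A minimal sketch of building a tiny engine with the Python flavor of that API might look like the following; the input shape and single ReLU layer are placeholders, not taken from any of the repositories.

```python
# Minimal sketch of the TensorRT Python network definition API (assumes a
# TensorRT 8.x+ install); the graph here is illustrative only.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)

# Define a trivial graph: one input tensor passed through a ReLU activation.
x = network.add_input("x", trt.float32, (1, 3, 224, 224))
relu = network.add_activation(x, trt.ActivationType.RELU)
network.mark_output(relu.get_output(0))

# Build a serialized engine that can later be deserialized for inference.
config = builder.create_builder_config()
engine_bytes = builder.build_serialized_network(network, config)
with open("relu.engine", "wb") as f:
    f.write(engine_bytes)
```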
Simple, safe way to store and distribute tensors
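A short sketch of the safetensors round trip, assuming the safetensors and torch packages are installed; the file name and tensor contents are placeholders.

```python
# Save and load tensors with safetensors (PyTorch backend).
import torch
from safetensors.torch import save_file, load_file

tensors = {"embedding": torch.zeros(4, 8), "bias": torch.ones(8)}
save_file(tensors, "model.safetensors")

loaded = load_file("model.safetensors")
print(loaded["embedding"].shape)  # torch.Size([4, 8])
```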
SGLang is a fast serving framework for large language models and vision language models.
Fast and memory-efficient exact attention
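A sketch of calling FlashAttention directly, assuming the flash-attn package and a CUDA GPU; tensors must be fp16/bf16 with shape (batch, seqlen, nheads, headdim).

```python
# Exact attention computed without materializing the full attention matrix.
import torch
from flash_attn import flash_attn_func

q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

out = flash_attn_func(q, k, v, causal=True)
print(out.shape)  # (2, 1024, 8, 64)
```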
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
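A hedged sketch of the high-level LLM API available in recent TensorRT-LLM releases; the model name and sampling settings are placeholders.

```python
# High-level TensorRT-LLM LLM API: build/load an engine and generate text.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
params = SamplingParams(max_tokens=64, temperature=0.8)

for output in llm.generate(["What is TensorRT-LLM?"], params):
    print(output.outputs[0].text)
```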
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
State-of-the-art 2D and 3D Face Analysis Project
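A sketch of insightface's FaceAnalysis app for face detection and recognition, assuming insightface and opencv-python are installed; the image path is a placeholder.

```python
import cv2
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")          # downloads the model pack on first use
app.prepare(ctx_id=0, det_size=(640, 640))    # ctx_id=0 -> first GPU, -1 -> CPU

img = cv2.imread("face.jpg")
faces = app.get(img)
for face in faces:
    print(face.bbox, face.det_score)          # bounding box and detection confidence
```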
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
OneDiff: An out-of-the-box acceleration library for diffusion models.
[EMNLP'23, ACL'24] To speed up LLM inference and help LLMs perceive key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
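A sketch of prompt compression with LLMLingua, assuming the llmlingua package; the prompt text and token budget are placeholders.

```python
from llmlingua import PromptCompressor

compressor = PromptCompressor()  # loads the default small compression model
result = compressor.compress_prompt(
    "Very long context goes here ...",
    instruction="Summarize the context.",
    question="What are the key points?",
    target_token=200,
)
print(result["compressed_prompt"])
```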
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
Seamless operability between C++11 and Python
A template matching library based on OpenCV, supporting rotation matching and cross-platform use from both C++ and Python.
Open standard for machine learning interoperability
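A short sketch of loading and validating an ONNX model, assuming the onnx package; "model.onnx" is a placeholder file.

```python
import onnx

model = onnx.load("model.onnx")
onnx.checker.check_model(model)                   # structural validity check
print(onnx.helper.printable_graph(model.graph))   # human-readable graph dump
```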
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
Serve, optimize and scale PyTorch models in production
A high-throughput and memory-efficient inference and serving engine for LLMs
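A sketch of offline batched inference with vLLM, assuming the vllm package and a CUDA GPU; the model name is a placeholder.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```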
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
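A sketch of the Transformers pipeline API, assuming the transformers package; the task's default checkpoint is downloaded automatically and may change between releases.

```python
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("TensorRT made my model twice as fast!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```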
Chat concurrently with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFLYTEK Spark (讯飞星火), ERNIE Bot (文心一言), and more, and discover the best answers.