ChrisGao001


📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,635 254 Updated Mar 4, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,837 507 Updated Mar 13, 2025

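A pure C++ cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile devices.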

C++ 3,430 351 Updated Mar 12, 2025

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

C++ 7,966 2,466 Updated Mar 13, 2025
Python 194 57 Updated Mar 28, 2023

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 140 27 Updated Mar 2, 2025

ONNXMLTools enables conversion of models to ONNX

Python 1,055 192 Updated Jan 8, 2025

A collection of compiler learning resources.

Python 2,308 342 Updated May 27, 2024

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,315 2,166 Updated Mar 11, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 87,852 23,578 Updated Mar 13, 2025

blibili

Go 3 Updated Nov 29, 2019

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training

C++ 977 200 Updated Mar 13, 2025

A comprehensive collection of admin back-end UIs (these are all you need): https://blog.csdn.net/m0_37499059/article/details/80519211

95 78 Updated May 31, 2018

Visualizer for neural network, deep learning and machine learning models

JavaScript 29,628 2,862 Updated Mar 12, 2025

The official GitHub mirror of the Chromium source

C++ 20,114 7,410 Updated Mar 13, 2025

A tiny compiler for a language featuring LL(2) with Lexer, Parser, ASM-like codegen and VM. Complex enough to give you a flavour of how the "real" thing works whilst not being a mere toy example

C 565 46 Updated Mar 20, 2023

This repository includes tutorials on how to use the TensorFlow estimator APIs to perform various ML tasks, in a systematic and standardised way

Jupyter Notebook 671 234 Updated Aug 20, 2024

Illustrated TensorFlow source code

2,180 597 Updated Nov 11, 2016

Distributed vector search for AI-native applications

Go 2,140 342 Updated Mar 13, 2025

A Raft consensus implementation that is simple and understandable

C++ 152 51 Updated Jun 24, 2020

Timing-wheel timer

Go 268 74 Updated Dec 23, 2023

pkuseg: a toolkit for multi-domain Chinese word segmentation

Python 6,596 991 Updated Nov 5, 2022

Mission: To provide a high-quality open content data structures textbook that is both mathematically rigorous and provides complete implementations.

TeX 1,223 247 Updated Feb 2, 2022

Classic books in computer science

137 46 Updated Apr 15, 2019

FlatBuffers: Memory Efficient Serialization Library

C++ 23,904 3,311 Updated Feb 11, 2025

A powerful flow control component enabling reliability, resilience and monitoring for microservices. (A high-availability flow control and protection component for cloud-native microservices)

Java 22,625 8,078 Updated Oct 24, 2024

Google core libraries for Java

Java 50,575 10,964 Updated Mar 11, 2025

simple demo for network programming

C 38 23 Updated Feb 9, 2017

oneAPI Threading Building Blocks (oneTBB)

C++ 5,985 1,065 Updated Mar 13, 2025