‼️
panicking
  • PRC

Stars

Projects

29 repositories

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

C++ 7,936 2,457 Updated Mar 10, 2025

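AutoKernel is a simple, easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.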

C++ 736 94 Updated Sep 23, 2022
Python 194 57 Updated Mar 28, 2023

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

LLVM 31,303 12,931 Updated Mar 10, 2025

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Python 31,593 2,929 Updated Mar 10, 2025

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python 12,085 3,528 Updated Mar 10, 2025

An Open Source Machine Learning Framework for Everyone

C++ 188,488 74,582 Updated Mar 10, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 87,742 23,560 Updated Mar 10, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,297 2,164 Updated Mar 7, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,019 1,149 Updated Feb 28, 2025

A flexible framework of neural networks for deep learning

Python 5,910 1,366 Updated Aug 28, 2023

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ 849 164 Updated Dec 30, 2024

Development repository for the Triton language and compiler

MLIR 14,799 1,851 Updated Mar 10, 2025

Mold: A Modern Linker 🦠

C++ 14,931 491 Updated Mar 10, 2025

ncnn is a high-performance neural network inference framework optimized for the mobile platform

C++ 21,083 4,213 Updated Mar 10, 2025

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 3,014 515 Updated Mar 10, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,467 134 Updated Mar 10, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 11,677 1,189 Updated Mar 10, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,157 651 Updated Mar 9, 2025

[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.

Python 4,927 278 Updated Jan 26, 2025

Pipeline Parallelism for PyTorch

Python 754 88 Updated Aug 21, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,654 1,138 Updated Mar 7, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 21,748 2,389 Updated Aug 12, 2024

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,284 391 Updated Mar 10, 2025

Trax — Deep Learning with Clear Code and Speed

Python 8,174 824 Updated Feb 7, 2025

Minimalist ML framework for Rust

Rust 16,757 1,049 Updated Mar 9, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 40,982 6,175 Updated Mar 10, 2025

Fast and memory-efficient exact attention

Python 16,192 1,535 Updated Mar 9, 2025

LLM inference in C/C++

C++ 76,196 11,024 Updated Mar 10, 2025