Stars
Notepad++ official repository
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
A fast communication-overlapping library for tensor parallelism on GPUs.
chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Online compiler that translates HIP and NVIDIA® CUDA® code to WebGPU