fxmarty-amd

1 follower · 12 following

Stars

16 stars written in C++

notepad-plus-plus / notepad-plus-plus

Notepad++ official repository

C++ 23,766 4,696 Updated Feb 9, 2025

alibaba / rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 622 55 Updated Jan 21, 2025

amd / RyzenAI-SW

C++ 467 70 Updated Dec 10, 2024

ROCm / composable_kernel

Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators

C++ 340 146 Updated Feb 12, 2025

NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 300 55 Updated Feb 12, 2025

bytedance / flux

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 292 25 Updated Oct 30, 2024

CHIP-SPV / chipStar

chipStar is a tool for compiling and running HIP/CUDA on SPIR-V via OpenCL or Level Zero APIs.

C++ 248 35 Updated Feb 6, 2025

HanGuo97 / flute

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

C++ 227 8 Updated Feb 12, 2025

tile-ai / tilelang

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 211 17 Updated Feb 11, 2025

lights0123 / hipscript

Online compiler for HIP and NVIDIA® CUDA® code to WebGPU

C++ 137 1 Updated Jan 8, 2025

ROCm / hipBLAS

ROCm BLAS marshalling library

C++ 131 81 Updated Feb 11, 2025

ROCm / clr

C++ 117 54 Updated Feb 11, 2025

INT-FlashAttention2024 / INT-FlashAttention

C++ 59 3 Updated Jan 23, 2025

ROCm / amdsmi

AMD SMI

C++ 53 31 Updated Feb 12, 2025

ROCm / hipTensor

AMD’s C++ library for accelerating tensor primitives

C++ 37 21 Updated Feb 5, 2025

shenmishajing / pytorch_extension_example

C++ 8 Updated Jul 3, 2023
