suhao2

Stars

6 stars written in Cuda

Bruce-Lee-LY / cuda_hgemm

Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.

Cuda, 353 stars, 72 forks, updated Sep 8, 2024

wzsh / wmma_tensorcore_sample

Matrix Multiply-Accumulate with CUDA and WMMA (Tensor Core)

Cuda, 124 stars, 19 forks, updated Aug 18, 2020

unw9527 / ECE408

ECE408 (Applied Parallel Programming) Fall 2022 MP

Cuda, 10 stars, 5 forks, updated Mar 24, 2023

yutong-xie / ECE408-Applied-Parallel-Programming

19FA ECE408 MP&Project

Cuda, 8 stars, 3 forks, updated Jun 22, 2020

Violet24K / ECE408_project

UIUC ECE408 Fall 2021 Project

Cuda, 4 stars, 3 forks, updated Jun 3, 2022

minosys-jp / wmma

CUDA WMMA test project

Cuda, 3 stars, 2 forks, updated Jun 14, 2018