Stars
A Row Decomposition-based Approach for Sparse Matrix Multiplication on GPUs
A simple high-performance CUDA GEMM implementation (see the naive GEMM sketch after this list).
This project optimizes convolution operators on GPUs, including GEMM-based (implicit GEMM) convolution.
A library of GPU kernels for sparse matrix operations.
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
Mirror of http://gitlab.hpcrl.cse.ohio-state.edu/chong/ppopp19_ae, refactored for easier understanding
Source code of the PPoPP '22 paper: "TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs" by Yuyao Niu, Zhengyang Lu, Haonan Ji, Shuhui Song, Zhou Jin, an…
Implementation and optimization of an SpGEMM kernel on DCU.
The source code of the paper "Accelerating CPU-based Sparse General Matrix Multiplication with Binary Row Merging"
CSR-based SpGEMM on NVIDIA and AMD GPUs (see the CSR sketch after this list)
This repository was obtained from https://bitbucket.org/azadcse/hipmcl/src
SuiteSparse:GraphBLAS: graph algorithms in the language of linear algebra. For production: (default) STABLE branch. Code development: ask me for the right branch before submitting a PR. video intro…
Source code for VLDB 2015 paper "The More the Merrier: Efficient Multi-Source Graph Traversal"
Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
Implementation of 3D non-separable convolution using CUDA and FFT convolution
Implementation of the paper - Fast Training of Convolutional Networks through FFTs (CUDA for parallelization)
Winograd minimal convolution algorithm generator for convolutional neural networks.
QUDA is a library for performing calculations in lattice QCD on GPUs.
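For orientation, here is a minimal, illustrative sketch of the dense GEMM computation that the CUDA GEMM and implicit-GEMM convolution projects above optimize. It is not taken from any of the listed repositories; kernel and variable names are assumptions, and it deliberately omits the tiling and shared-memory techniques those projects are about.

```cuda
#include <cuda_runtime.h>

// Naive sketch of C = A * B for row-major M x K and K x N matrices.
// One thread computes one element of C; real high-performance kernels
// add tiling, shared memory, and register blocking on top of this.
__global__ void gemm_naive(const float* A, const float* B, float* C,
                           int M, int N, int K) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k) {
            acc += A[row * K + k] * B[k * N + col];
        }
        C[row * N + col] = acc;
    }
}

// Example launch: 16x16 thread blocks covering the M x N output.
// dim3 block(16, 16);
// dim3 grid((N + 15) / 16, (M + 15) / 16);
// gemm_naive<<<grid, block>>>(dA, dB, dC, M, N, K);
```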
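Similarly, a short sketch of the CSR storage format that the SpGEMM projects above operate on, shown here as a row-per-thread sparse matrix-vector multiply (y = A * x). This is only to illustrate the row_ptr / col_idx / vals layout; it is not code from any listed repository, and actual SpGEMM kernels are considerably more involved (symbolic and numeric phases, row merging, tiling).

```cuda
#include <cuda_runtime.h>

// Illustrative CSR SpMV: each thread accumulates one row of A against x.
// row_ptr has num_rows + 1 entries; col_idx/vals hold the nonzeros per row.
__global__ void csr_spmv(const int* row_ptr, const int* col_idx,
                         const float* vals, const float* x,
                         float* y, int num_rows) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows) {
        float sum = 0.0f;
        for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j) {
            sum += vals[j] * x[col_idx[j]];
        }
        y[row] = sum;
    }
}
```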