Codedestructor56

10 stars written in Cuda

LLM training in simple, raw C/CUDA

Cuda 26,651 3,063 Updated May 10, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA/Tensor Cores Kernels, HGEMM, FA-2 MMA etc.🔥

Cuda 4,428 467 Updated May 17, 2025

How to optimize some algorithms in CUDA.

Cuda 2,200 192 Updated May 23, 2025

Learn CUDA Programming, published by Packt

Cuda 1,143 254 Updated Dec 30, 2023

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 710 125 Updated Feb 21, 2025

A set of hands-on tutorials for CUDA programming

Cuda 221 33 Updated Apr 8, 2024

CUDA Matrix Multiplication Optimization

Cuda 187 20 Updated Jul 19, 2024

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 147 28 Updated May 21, 2025

Benchmark tests supporting the TiledCUDA library.

Cuda 16 2 Updated Nov 19, 2024