Starred repositories
A Python framework for high performance GPU simulation and graphics
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
🚀🚀 Train a 26M-parameter GPT completely from scratch in just 3 hours! 🌏
This is the Rust course used by the Android team at Google. It provides the material to teach Rust quickly.
An open-source C++ library developed and used at Facebook.
Efficient, Flexible and Portable Structured Generation
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Examples for using ONNX Runtime for model training.
Large Language Model (LLM) Systems Paper List
A retargetable MLIR-based machine learning compiler and runtime toolkit.
OCR, layout analysis, reading order, table recognition in 90+ languages
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently
GPU programming related news and material links
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Code samples used on cloud.google.com
FlashInfer: Kernel Library for LLM Serving
A throughput-oriented high-performance serving framework for LLMs
Lightning-fast serving engine for any AI model of any size. Flexible. Easy. Enterprise-scale.
A curated list of Rust code and resources.
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet) / If you find it useful, please star this project, thanks~
llama3 implementation one matrix multiplication at a time
A minimal GPU design in Verilog to learn how GPUs work from the ground up