
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,283 244 Updated Feb 7, 2025

Material for gpu-mode lectures

Jupyter Notebook 3,683 371 Updated Feb 9, 2025

JAX-Toolbox

Jupyter Notebook 280 56 Updated Feb 13, 2025

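Book_2_《可视之美》 (The Beauty of Visualization) | Iris Book series: From Basic Arithmetic to Machine Learning; feedback and corrections welcome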

Jupyter Notebook 3,013 624 Updated Sep 11, 2024

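Book_3_《数学要素》 (Elements of Mathematics) | Iris Book series: From Basic Arithmetic to Machine Learning; now published; further corrections are welcome, and readers who submit many corrections will receive a free copy!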

Jupyter Notebook 6,722 1,180 Updated Jan 26, 2025

Book_4_《矩阵力量》 (The Power of Matrices) | Iris Book series: From Basic Arithmetic to Machine Learning; now published!

Jupyter Notebook 9,057 1,373 Updated Feb 1, 2025

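Book_6_《数据有道》 (The Way of Data) | Iris Book series: From Basic Arithmetic to Machine Learning; feedback and corrections welcome! Readers who submit many corrections will receive a free copy as thanks!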

Jupyter Notebook 2,199 407 Updated Sep 11, 2024

Book_7_《机器学习》 (Machine Learning) | Iris Book series: From Basic Arithmetic to Machine Learning; feedback and corrections welcome

Jupyter Notebook 2,658 507 Updated Sep 11, 2024

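Book_1_《编程不难》 (Programming Is Not Hard) | Iris Book series: From Basic Arithmetic to Machine Learning; feedback and corrections are very welcome!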

Jupyter Notebook 5,274 1,035 Updated Sep 11, 2024

Book_5_《统计至简》 (Statistics Made Simple) | Iris Book series: From Basic Arithmetic to Machine Learning; now published!

Jupyter Notebook 3,129 646 Updated Feb 6, 2025
MLIR 48 19 Updated Mar 5, 2024

Experiments and prototypes associated with IREE or MLIR

MLIR 51 47 Updated Aug 9, 2024

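AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks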

Jupyter Notebook 12,286 1,780 Updated Jan 2, 2025

A machine learning compiler for GPUs, CPUs, and ML accelerators

C++ 2,945 499 Updated Feb 13, 2025

Chinese translation of Bjarne Stroustrup's HOPL4 paper

2,243 400 Updated Dec 10, 2024
Shell 8,196 1,130 Updated Feb 12, 2025

A feature-rich command-line audio/video downloader

Python 99,923 7,824 Updated Feb 11, 2025

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,159 360 Updated Feb 12, 2025

Plot in the terminal using braille dots.

Python 433 20 Updated Jan 22, 2025

Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍

Shell 22,538 3,395 Updated Jan 19, 2025

Bash Line Editor―a line editor written in pure Bash with syntax highlighting, auto suggestions, vim modes, etc. for Bash interactive sessions.

Shell 2,940 84 Updated Feb 9, 2025

A delightful community-driven framework for managing your bash configuration, and an auto-update tool that makes it easy to keep up with the latest updates from the community.

Shell 6,279 690 Updated Jan 27, 2025

A collection of my config files.

Shell 590 218 Updated Dec 29, 2024

A TensorFlow Extension: GPU performance tools for TensorFlow.

Python 25 7 Updated Jul 27, 2023

A visualization tool to display TF-Grappler optimized op graph

Python 12 5 Updated Aug 31, 2022

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,715 448 Updated Oct 9, 2023

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

C++ 4,945 757 Updated Feb 8, 2024

Fast integer division with divisor not known at compile time. To be used primarily in CUDA kernels.

Cuda 71 9 Updated Nov 4, 2015

🚀An automatic configuration program for vim

Vim Script 3,950 1,134 Updated Jun 5, 2024