Starred repositories

Zstandard - Fast real-time compression algorithm

C 24,381 2,191 Updated Feb 20, 2025

Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.

C++ 86 17 Updated Nov 23, 2022

Triton is a dynamic binary analysis library. Build your own program analysis tools, automate your reverse engineering, perform software verification or just emulate code.

C++ 3,629 539 Updated Feb 16, 2025
C++ 27 4 Updated Jun 19, 2024

OpenPCDet Toolbox for LiDAR-based 3D Object Detection.

Python 4,843 1,320 Updated Aug 8, 2024

OpenMMLab's next-generation platform for general 3D object detection.

Python 5,526 1,581 Updated Jul 10, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,420 1,449 Updated Jan 19, 2025

Samples for CUDA developers that demonstrate features of the CUDA Toolkit

C 6,958 1,933 Updated Feb 21, 2025
C++ 20 7 Updated Feb 17, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 38,916 5,826 Updated Feb 23, 2025

Zhejiang University course guide sharing project

HTML 37,853 9,465 Updated Feb 21, 2025

NCCL Profiling Kit

Python 127 12 Updated Jul 1, 2024

[EuroSys'24] Minuet: Accelerating 3D Sparse Convolutions on GPUs

Cuda 75 3 Updated Jun 7, 2024

Microsoft Collective Communication Library

62 6 Updated Nov 23, 2024

TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches

Python 69 10 Updated Jul 25, 2023

Unified Collective Communication Library

C 227 104 Updated Feb 19, 2025

Microsoft Collective Communication Library

C++ 336 31 Updated Sep 20, 2023

Synthesizer for optimal collective communication algorithms

Python 104 25 Updated Apr 8, 2024

ROCm Communication Collectives Library (RCCL)

C++ 298 137 Updated Feb 22, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,484 864 Updated Jan 27, 2025

Works around frequent GitHub access failures in mainland China by modifying the hosts file; updated daily

Java 1,529 101 Updated Jan 22, 2025

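Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other codebases, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…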

Python 67,642 8,298 Updated Feb 21, 2025

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs

C++ 1,279 190 Updated Apr 14, 2024

High-Performance Parallel Programming and Optimization - course slides

C++ 3,899 551 Updated Oct 18, 2024

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

Python 1,550 191 Updated Aug 12, 2020

Uses Tensor Cores to compute back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instructions.

Cuda 11 2 Updated Nov 3, 2023

A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores

Python 48 7 Updated Nov 24, 2023

GraphQL framework for Python

Python 8,143 829 Updated Nov 9, 2024

Beihang University Mathematical Statistics course study materials

TeX 212 24 Updated Dec 4, 2020