View ssbandjl's full-sized avatar

Starred repositories


A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 7,055 596 Updated Mar 4, 2025

Libtpa (Transport Protocol Acceleration), a DPDK-based userspace TCP stack implementation.

C 115 16 Updated Mar 19, 2024

RoCEv2 hardware implementation in Bluespec SystemVerilog

Bluespec 22 13 Updated Sep 12, 2024

DeepEP: an efficient expert-parallel communication library

Cuda 6,926 579 Updated Mar 4, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,064 753 Updated Mar 1, 2025

NCCL Tests

Cuda 1,015 263 Updated Feb 28, 2025

Official MPICH Repository

C 589 286 Updated Mar 4, 2025

Open MPI main development repository

C 2,281 886 Updated Feb 24, 2025

IOR and mdtest

C 411 173 Updated Feb 26, 2025

High performance container overlay networks on Linux. Enabling RDMA (on both InfiniBand and RoCE) and accelerating TCP to bare metal performance. Freeflow requires zero modification on application …

C 617 93 Updated Jun 12, 2023

Notes taken by zweix while learning computer related knowledge

127 6 Updated Jan 8, 2025

cloud-native distributed storage

Go 4,927 690 Updated Mar 3, 2025

High performance RDMA-based distributed feature collection component for training GNN model on EXTREMELY large graph

C++ 51 6 Updated Jul 3, 2022

A Toolchain to make Build and Run eBPF programs easier

Rust 720 65 Updated Sep 5, 2024
C 2 Updated Dec 3, 2024

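AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.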

Jupyter Notebook 12,694 1,827 Updated Mar 1, 2025

framework for emulating devices in userspace

C 174 53 Updated Aug 19, 2024

Companion source code and errata list for the book 《深入理解Linux进程与内存》 (Understanding Linux Processes and Memory in Depth).

C 102 27 Updated Feb 16, 2025

Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"

Assembly 788 91 Updated May 3, 2024

Simple shell implementation. Tutorial here ->

C 1,550 348 Updated Aug 2, 2022

First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.

C 361 73 Updated Nov 10, 2014

NVIDIA Linux open GPU kernel module source

C 15,588 1,353 Updated Mar 3, 2025

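Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.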

Python 66,351 11,303 Updated Jul 30, 2024

The reference implementation of the Linux FUSE (Filesystem in Userspace) interface

C 5,525 1,160 Updated Mar 3, 2025

Storage Performance Development Kit

C 4 1 Updated Aug 14, 2024

Magnum IO community repo

C++ 84 17 Updated Jan 21, 2025

oneAPI Collective Communications Library (oneCCL)

C++ 223 73 Updated Jan 23, 2025