Xin Yao yaox12

🐳

Slacking

94 followers · 34 following

Stars

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda 2,321 241 Updated Mar 9, 2025

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 11,645 1,185 Updated Mar 10, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 40,890 6,155 Updated Mar 10, 2025

pyutils / line_profiler

Line-by-line profiling for Python

Python 2,882 125 Updated Jan 30, 2025

NVIDIA / nvidia-resiliency-ext

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 95 9 Updated Feb 21, 2025

chemharuka / toGainMapHDR

A tool to convert HDR file to Adaptive HDR (Gain Map HDR) and ISO HDR format in HEIC

Swift 48 2 Updated Jan 14, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 16,176 1,532 Updated Mar 9, 2025

huggingface / safetensors

Simple, safe way to store and distribute tensors

Python 3,155 229 Updated Mar 5, 2025

NVIDIA / cccl

CUDA Core Compute Libraries

C++ 1,509 200 Updated Mar 10, 2025

typst / typst

A new markup-based typesetting system that is powerful and easy to learn.

Rust 38,200 1,047 Updated Mar 7, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 11,680 2,621 Updated Mar 8, 2025

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 140,890 28,229 Updated Mar 8, 2025

NVIDIA / Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 313 55 Updated Mar 10, 2025

NVIDIA / NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 354 50 Updated Mar 5, 2025

asottile / pyupgrade

A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.

Python 3,719 188 Updated Feb 17, 2025

nv-legate / cupynumeric

An Aspiring Drop-In Replacement for NumPy at Scale

Python 828 81 Updated Mar 5, 2025

CVCUDA / CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library for cloud-scale image processing and computer vision.

C++ 2,457 222 Updated Mar 3, 2025

wjakob / nanobind

nanobind: tiny and efficient C++/Python bindings

C++ 2,636 220 Updated Mar 2, 2025

suo / lintrunner

Rust 28 15 Updated Dec 7, 2024

StaZhu / enable-chromium-hevc-hardware-decoding

A guide that teaches you to enable hardware HEVC decoding & encoding for Chrome / Edge, or build a custom version of Chromium / Electron that supports hardware & software HEVC decoding and hardware HEVC…

JavaScript 1,293 63 Updated Feb 28, 2025

arogozhnikov / einops

Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)

Python 8,787 361 Updated Feb 9, 2025

openucx / ucc

Unified Collective Communication Library

C 231 105 Updated Mar 6, 2025

openucx / ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,237 441 Updated Mar 9, 2025

NVIDIA-Merlin / HierarchicalKV

HierarchicalKV is a part of NVIDIA Merlin and provides hierarchical key-value storage to meet RecSys requirements. The key capability of HierarchicalKV is to store key-value feature-embeddings on h…

Cuda 140 27 Updated Mar 2, 2025

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python 2,249 377 Updated Mar 9, 2025

rapidsai / wholegraph

WholeGraph - large scale Graph Neural Networks

Cuda 101 38 Updated Nov 25, 2024

NVIDIA / libcudacxx

[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl

C++ 2,297 186 Updated Feb 7, 2024

NVIDIA / thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

C++ 4,953 756 Updated Feb 8, 2024

llvm / torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,460 538 Updated Mar 10, 2025

pytorch / torchdynamo

A Python-level JIT compiler designed to make unmodified PyTorch programs faster.

Python 1,034 125 Updated Apr 17, 2024
