reyoung
  • Tencent
  • Beijing


Tutel MoE: An Optimized Mixture-of-Experts Implementation

Python 753 94 Updated Jan 18, 2025

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 293 44 Updated Feb 5, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 720 28 Updated Sep 21, 2024

Synchronization and asynchronous computation package for Go

Go 236 11 Updated Sep 7, 2024

A declarative drawing API in Python

Python 293 13 Updated Aug 28, 2024

[ICLR2025] Official Implementation of TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Python 485 39 Updated Feb 6, 2025

Borgo is a statically typed language that compiles to Go.

Rust 4,352 60 Updated Oct 27, 2024

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

TypeScript 23,633 1,922 Updated Feb 6, 2025

GLake: optimizing GPU memory management and IO transmission.

Python 426 35 Updated Nov 27, 2024

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 3,923 299 Updated Feb 4, 2025

Ring attention implementation with flash attention

Python 668 59 Updated Dec 19, 2024

Ant game engine

Lua 3,884 401 Updated Feb 5, 2025

An event based XML parsing API for Go

Go 20 5 Updated Jun 19, 2014

Implementation of MagViT2 Tokenizer in Pytorch

Python 587 33 Updated Jan 12, 2025
Python 1,256 179 Updated Nov 20, 2024

Chat凉宫春日 (Chat-Haruhi-Suzumiya): an open-source role-playing chatbot by Cheng Li, Ziang Leng, and others.

Jupyter Notebook 1,893 172 Updated Aug 13, 2024

Cross-platform terminal tunnel tool

C 357 35 Updated May 21, 2024

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.

Python 648 59 Updated Apr 9, 2024

Minimalist ML framework for Rust

Rust 16,487 1,018 Updated Feb 4, 2025

C++ exception handling library

C++ 39 14 Updated Jan 17, 2024

Development repository for the Triton language and compiler

C++ 14,283 1,772 Updated Feb 6, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,981 641 Updated Feb 5, 2025

Implementation of a Transformer, but completely in Triton

Python 255 15 Updated Apr 5, 2022

Unsupervised text tokenizer focused on computational efficiency

C++ 963 103 Updated Mar 29, 2024

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

Jupyter Notebook 1,554 95 Updated Feb 16, 2024

Efficient cache for gigabytes of data written in Go.

Go 7,673 602 Updated Jan 20, 2025

A library to analyze PyTorch traces.

Python 327 47 Updated Feb 6, 2025

asyncio is a C++20 library for writing concurrent code using the async/await syntax.

C++ 860 82 Updated Feb 3, 2024

Convert C to Go

Go 303 20 Updated May 18, 2024