ChangyongYang's starred repositories

(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)

Python 141 8 Updated May 9, 2024
Python 2 Updated Aug 13, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 33,202 5,049 Updated Jan 6, 2025

An extension for using Cursor in Visual Studio Code.

TypeScript 1,732 77 Updated Oct 8, 2024

Large-scale inference for RWKV v6 and v7. Capable of inference by combining multiple states (pseudo-MoE). Easy to deploy on Docker. Supports true multi-batch generation and dynamic state switching. CUDA …

Python 21 1 Updated Dec 23, 2024

vLLM documentation in Simplified Chinese (vLLM 中文文档)

TypeScript 16 1 Updated Dec 14, 2024

TVM documentation in Simplified Chinese (TVM 中文文档)

TypeScript 974 156 Updated Jan 6, 2025

Triton documentation in Simplified Chinese (Triton 中文文档)

TypeScript 51 4 Updated Dec 11, 2024

Notes on LLMs, covering model inference, transformer model structure, and LLM framework code analysis.

Python 338 32 Updated Jan 5, 2025

Development repository for the Triton language and compiler

C++ 13,900 1,693 Updated Jan 6, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,549 1,505 Updated Jan 6, 2025

[NeurIPS 2024] Official implementation of the paper "Are Self-Attentions Effective for Time Series Forecasting?"

Python 27 3 Updated Dec 31, 2024

This is an official implementation of "DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching" (NeurIPS 2024)

Python 6 Updated Oct 30, 2024
Jupyter Notebook 55 13 Updated Jan 2, 2025

Official Implementation of "From Similarity to Superiority: Channel Clustering for Time Series Forecasting"

Python 22 1 Updated Oct 30, 2024

Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

Python 61 6 Updated Dec 31, 2024

[KDD 2025] DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Python 36 6 Updated Jan 4, 2025

Official Code for "How Much Can Time-related Features Enhance Time Series Forecasting?"

Shell 19 3 Updated Nov 27, 2024

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 14 Updated Jan 3, 2025
Python 52 14 Updated Nov 22, 2024

The official code for "One Fits All: Power General Time Series Analysis by Pretrained LM (NeurIPS 2023 Spotlight)"

Python 492 71 Updated Jan 8, 2024

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 1,577 79 Updated Jan 6, 2025

NanoGPT (124M) in 3.4 minutes

Python 2,014 198 Updated Jan 6, 2025

RWKV-7: Surpassing GPT

Python 68 5 Updated Nov 17, 2024

RWKV6 in native PyTorch and Triton :)

Python 11 Updated Aug 4, 2024
Python 14 Updated Aug 1, 2023

Fast & Simple repository for pre-training and fine-tuning T5-style models

Python 987 74 Updated Aug 21, 2024

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Python 389 16 Updated Oct 31, 2024