Starred repositories
(Unofficial) PyTorch implementation of grouped-query attention (GQA) from "GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints" (https://arxiv.org/pdf/2305.13245.pdf)
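In grouped-query attention, a small number of key/value heads is shared across groups of query heads, interpolating between multi-head and multi-query attention. A minimal PyTorch sketch of that idea (not the linked repo's code; the shapes and the naive repeat-based broadcast are illustrative assumptions):

```python
import torch

def grouped_query_attention(q, k, v):
    """GQA sketch: n_q query heads share n_kv (< n_q) key/value heads.

    q: (batch, n_q, seq, d); k, v: (batch, n_kv, seq, d), with n_q % n_kv == 0.
    """
    n_q, n_kv, d = q.shape[1], k.shape[1], q.shape[-1]
    group = n_q // n_kv  # query heads per shared KV head
    # Naively repeat each KV head so it serves its whole query group.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    return attn @ v

# Example: 8 query heads sharing 2 KV heads (group size 4).
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
out = grouped_query_attention(q, k, v)  # (1, 8, 16, 64)
```

Real implementations avoid materializing the repeated KV tensors; the repeat here just makes the head-sharing explicit.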
A high-throughput and memory-efficient inference and serving engine for LLMs
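For context, vLLM's offline batch API looks roughly like this (a minimal sketch; the model id is only an example, and details may differ across vLLM versions):

```python
from vllm import LLM, SamplingParams

# Any Hugging Face causal-LM id works here; opt-125m is just a small example.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

for out in llm.generate(["The capital of France is"], params):
    print(out.prompt, "->", out.outputs[0].text)
```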
An extension for using Cursor in Visual Studio Code.
A large-scale RWKV v6/v7 inference engine. Capable of inference that combines multiple states (Pseudo MoE). Easy to deploy on Docker. Supports true multi-batch generation and dynamic state switching. CUDA …
vLLM Documentation in Simplified Chinese / vLLM 中文文档
TVM Documentation in Simplified Chinese / TVM 中文文档
Triton Documentation in Simplified Chinese / Triton 中文文档
LLM notes, covering model inference, Transformer model structure, and LLM framework code analysis.
Development repository for the Triton language and compiler
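To show what Triton code looks like, here is the canonical vector-add kernel from the Triton tutorials (requires a CUDA GPU; the block size is an arbitrary choice):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```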
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
[NeurIPS 2024] Official implementation of the paper "Are Self-Attentions Effective for Time Series Forecasting?"
This is an official implementation of "DeformableTST: Transformer for Time Series Forecasting without Over-reliance on Patching" (NeurIPS 2024)
Official Implementation of "From Similarity to Superiority: Channel Clustering for Time Series Forecasting"
Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
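The delta rule in the title treats the recurrent state as fast weights that overwrite the value stored at the current key, rather than purely accumulating. A naive recurrent sketch, assuming the gated update form reported in the DeltaNet line of work (the official repo ships fused kernels; names and shapes here are illustrative):

```python
import torch

def gated_delta_rule(q, k, v, alpha, beta):
    """Sketch of the gated delta-rule recurrence:
    S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T,  o_t = S_t q_t.

    q, k, v: (seq, d); alpha, beta: (seq,) gates in (0, 1).
    """
    seq, d = q.shape
    S = torch.zeros(d, d)  # fast-weight state, (d_v, d_k)
    outs = []
    for t in range(seq):
        # Decay old memory, erase what k_t currently retrieves, write v_t.
        S = alpha[t] * (S - beta[t] * torch.outer(S @ k[t], k[t])) \
            + beta[t] * torch.outer(v[t], k[t])
        outs.append(S @ q[t])
    return torch.stack(outs)
```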
[KDD 2025] DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting
Official Code for "How Much Can Time-related Features Enhance Time Series Forecasting?"
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
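The common thread in these models: replacing softmax with a kernel feature map turns attention into a running outer-product state with constant memory per step. A naive recurrent sketch of vanilla linear attention (after Katharopoulos et al., 2020; the repo's versions are fused Triton kernels, and this is not its code):

```python
import torch

def linear_attention(q, k, v):
    """Recurrent linear attention:
    S_t = S_{t-1} + k_t v_t^T,  z_t = z_{t-1} + k_t,  o_t = q_t S_t / (q_t . z_t).

    q, k, v: (seq, d). Feature map elu(x)+1 keeps scores positive.
    """
    phi = lambda x: torch.nn.functional.elu(x) + 1
    q, k = phi(q), phi(k)
    seq, d = q.shape
    S = torch.zeros(d, d)  # running sum of k v^T outer products
    z = torch.zeros(d)     # running sum of keys, for normalization
    outs = []
    for t in range(seq):
        S = S + torch.outer(k[t], v[t])
        z = z + k[t]
        outs.append((q[t] @ S) / (q[t] @ z + 1e-6))
    return torch.stack(outs)

out = linear_attention(torch.randn(16, 8), torch.randn(16, 8), torch.randn(16, 8))
```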
The official code for "One Fits All: Power General Time Series Analysis by Pretrained LM (NeurIPS 2023 Spotlight)"
RWKV-7: Surpassing GPT
uniartisan / TorchRWKV
Forked from yuunnn-w/RWKV_Pytorch
RWKV6 in native PyTorch and Triton :)
Fast & Simple repository for pre-training and fine-tuning T5-style models
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures