monellz
  • Tsinghua University

Starred repositories

Domain-specific language designed to streamline the development of high-performance GPU/CPU kernels

C 133 12 Updated Jan 22, 2025

Accelerated First Order Parallel Associative Scan

Python 169 8 Updated Aug 20, 2024

🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 1,773 97 Updated Jan 22, 2025

Python 22 4 Updated Dec 21, 2024

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 852 151 Updated Jan 4, 2025

Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 891 32 Updated Jan 21, 2025

Helpful tools and examples for working with flex-attention

Python 593 33 Updated Jan 15, 2025

A very fast and expressive template engine.

Python 10,529 1,627 Updated Jan 14, 2025

C++ 94 26 Updated Dec 6, 2024

Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Cuda 874 51 Updated Dec 28, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,440 420 Updated Jan 12, 2025

PKU LaTeX

JavaScript 48 4 Updated Jan 16, 2025

A web-based collaborative LaTeX editor

JavaScript 14,620 1,492 Updated Jan 22, 2025

An experimental modular OS written in Rust.

Rust 576 302 Updated Jan 7, 2025

Python interface for MLIR - the Multi-Level Intermediate Representation

Python 238 38 Updated Nov 28, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 706 29 Updated Sep 21, 2024

😎 A complete list of Arch-based projects

HTML 459 27 Updated Sep 29, 2024

🔥 Top-Rated Web-Based Linux Server Management Tool. 1Panel features an intuitive web interface that seamlessly integrates server management and monitoring, container management, database administra…

Go 25,074 2,251 Updated Jan 22, 2025

A fast reverse proxy to help you expose a local server behind a NAT or firewall to the internet.

Go 89,634 13,641 Updated Jan 16, 2025

📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion

Python 1,693 122 Updated Jan 22, 2025

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,761 115 Updated Jan 13, 2025

The universal proxy platform

Go 21,542 2,572 Updated Jan 19, 2025

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 276 24 Updated Oct 30, 2024

Python 22 2 Updated Jan 14, 2025

The missing pieces (as far as boilerplate reduction goes) of the upstream MLIR python bindings.

Python 75 8 Updated Jan 19, 2025

Experimental projects related to TensorRT

MLIR 86 14 Updated Jan 21, 2025

A Python Compiler Design Toolkit

Python 300 79 Updated Jan 22, 2025

PyTorch native quantization and sparsity for training and inference

Python 1,765 204 Updated Jan 22, 2025

A family of header-only, very fast and memory-friendly hashmap and btree containers.

C++ 2,717 252 Updated Jan 21, 2025

A lightweight and highly efficient training framework for accelerating diffusion tasks.

Python 45 2 Updated Sep 14, 2024