iwzbi
‼️
panicking
  • PRC


depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 599 18 Updated Dec 7, 2024

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,712 197 Updated Mar 4, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,640 2,182 Updated Feb 1, 2025

🧑‍🚀 Bloated LunarVim 🚀

Lua 478 64 Updated Feb 23, 2025

Ring attention implementation with flash attention

Python 707 60 Updated Feb 24, 2025

Efficient LLM Inference over Long Sequences

Python 362 19 Updated Feb 14, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,996 233 Updated Mar 10, 2025

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 27,941 5,745 Updated Mar 10, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,467 134 Updated Mar 10, 2025

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,773 285 Updated Mar 4, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 11,667 1,187 Updated Mar 10, 2025

NCCL Tests

Cuda 1,021 264 Updated Feb 28, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,536 876 Updated Jan 27, 2025

Material for gpu-mode lectures

Jupyter Notebook 3,945 398 Updated Feb 9, 2025

OpenAI Triton backend for Intel® GPUs

MLIR 167 52 Updated Mar 10, 2025

A Tampermonkey browser userscript for LeetCode (力扣) | TamperMonkey | Chrome

SCSS 934 30 Updated Mar 10, 2025

A weekly digest of curated open-source GitHub projects, updated every Monday

956 42 Updated Mar 10, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,782 171 Updated Mar 7, 2025

Notes on knowledge and interview questions for large language model (LLM) algorithm (application) engineers

HTML 5,947 692 Updated Oct 22, 2024

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

19,782 2,740 Updated Mar 8, 2025

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 80,347 11,757 Updated Mar 10, 2025

Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.

Python 20,404 2,716 Updated Mar 10, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,155 650 Updated Mar 9, 2025

[EMNLP'23, ACL'24] To speed up LLM inference and improve LLMs' perception of key information, compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

Python 4,926 278 Updated Jan 26, 2025

NO TIME TO SLEEP

Python 644 25 Updated May 26, 2024

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Python 81,835 7,062 Updated Mar 10, 2025

Pipeline Parallelism for PyTorch

Python 754 88 Updated Aug 21, 2024

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 8,450 1,046 Updated Mar 10, 2025

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 7,948 607 Updated Aug 18, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

628 32 Updated Mar 10, 2025