Stars
depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.
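A minimal sketch of how depyf is typically used, assuming its prepare_debug context manager; the dump directory and toy function are placeholders:

```python
import torch
import depyf

@torch.compile
def toy(x):
    return torch.sin(x) + torch.cos(x)

# Run compiled code inside the context; depyf dumps decompiled Dynamo/Inductor
# artifacts into the directory for inspection (directory name is an example).
with depyf.prepare_debug("./depyf_dump"):
    toy(torch.randn(8))
```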
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Janus-Series: Unified Multimodal Understanding and Generation Models
Ring attention implementation with flash attention
Efficient LLM Inference over Long Sequences
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
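A minimal text-to-image sketch with Diffusers; the checkpoint id is only an example, and other Hub pipelines load the same way:

```python
import torch
from diffusers import DiffusionPipeline

# Example checkpoint; fp16 weights keep memory use modest on a single GPU.
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe("a photo of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```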
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
📚 200+ Tensor/CUDA Core kernels, ⚡️ flash-attn-mma, ⚡️ hgemm with WMMA, MMA, and CuTe (reaching 98%~100% of cuBLAS/FA2 TFLOPS 🎉🎉).
SGLang is a fast serving framework for large language models and vision language models.
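A rough sketch of talking to an SGLang server through its OpenAI-compatible endpoint; the model path and port are example values:

```python
# Start the server first (model path and port are examples):
#   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
import openai

client = openai.OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize what a serving framework does."}],
)
print(resp.choices[0].message.content)
```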
Optimized primitives for collective multi-GPU communication
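NCCL itself is a C/C++ library; one common way to exercise its collectives from Python is through the torch.distributed "nccl" backend, sketched here as an all-reduce across two GPUs (the script name is illustrative):

```python
# Launch with: torchrun --nproc_per_node=2 allreduce_demo.py
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank)

t = torch.ones(4, device="cuda") * (rank + 1)
dist.all_reduce(t, op=dist.ReduceOp.SUM)  # every rank ends up with the elementwise sum
print(f"rank {rank}: {t}")

dist.destroy_process_group()
```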
OpenAI Triton backend for Intel® GPUs
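For context, a standard Triton kernel of the kind such a backend compiles, here a plain vector add; on Intel GPUs the tensors would live on an "xpu"-style device rather than "cuda", which is assumed below:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.randn(4096, device="cuda")  # "cuda" assumed; device name differs per backend
print(torch.allclose(add(x, x), x + x))
```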
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
Build Multimodal AI Agents with memory, knowledge and tools. Simple, fast and model-agnostic.
Hackable and optimized Transformers building blocks, supporting a composable construction.
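A small sketch using xFormers' memory-efficient attention operator; shapes, dtype, and device are illustrative:

```python
import torch
import xformers.ops as xops

# Attention inputs laid out as (batch, sequence, heads, head_dim).
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Memory-efficient attention kernel; an attention bias / causal mask can optionally be passed.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)
```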
[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
FastAPI framework, high performance, easy to learn, fast to code, ready for production
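The canonical minimal FastAPI app, for reference; the route and module name are examples (serve with `uvicorn main:app --reload`):

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/items/{item_id}")
def read_item(item_id: int, q: str | None = None):
    # Path and query parameters are parsed and validated from the type hints.
    return {"item_id": item_id, "q": q}
```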
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
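A minimal training-loop sketch with Accelerate; the toy model and data stand in for a real setup:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # detects device, mixed precision, and distributed config

# Toy model and data; placeholders for a real training setup.
model = nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = DataLoader(TensorDataset(torch.randn(64, 16), torch.randn(64, 1)), batch_size=8)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for x, y in dataloader:
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward() so mixed-precision/DDP hooks apply
    optimizer.step()
```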
A minimal GPU design in Verilog to learn how GPUs work from the ground up
📰 Must-read papers and blogs on Speculative Decoding ⚡️
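For orientation, a toy greedy draft-and-verify loop illustrating the idea behind speculative decoding; `target` and `draft` are placeholder callables, batch size 1 is assumed, and real implementations use a probabilistic accept/reject rule rather than exact greedy matching:

```python
import torch

@torch.no_grad()
def speculative_decode_greedy(target, draft, ids, k=4, max_new=32):
    """`target` and `draft` map token ids [1, L] to logits [1, L, V]."""
    while max_new > 0:
        # 1) Cheap draft model proposes k tokens autoregressively.
        proposal = ids
        for _ in range(k):
            nxt = draft(proposal)[:, -1].argmax(-1, keepdim=True)
            proposal = torch.cat([proposal, nxt], dim=-1)
        # 2) Expensive target model scores all k drafted positions in one forward pass.
        tgt_pred = target(proposal)[:, -k - 1:-1].argmax(-1)  # target's choice at each drafted position
        drafted = proposal[:, -k:]
        # 3) Keep the longest agreeing prefix; on the first mismatch, take the target's token instead.
        agree = (tgt_pred == drafted)[0]
        n_ok = int(agree.cumprod(0).sum())
        accepted = drafted[:, :n_ok]
        if n_ok < k:
            accepted = torch.cat([accepted, tgt_pred[:, n_ok:n_ok + 1]], dim=-1)
        ids = torch.cat([ids, accepted], dim=-1)
        max_new -= accepted.shape[-1]
    return ids
```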