FlashInfer: Kernel Library for LLM Serving

Cuda 2,013 207 Updated Feb 14, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,538 150 Updated Feb 14, 2025
Python 15 3 Updated Nov 27, 2024

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,241 102 Updated Feb 10, 2025

VideoSys: An easy and efficient system for video generation

Python 1,916 130 Updated Jan 1, 2025

This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 11,958 1,220 Updated Feb 2, 2025

Efficient Triton Kernels for LLM Training

Python 4,412 266 Updated Feb 14, 2025

Baidu's open-source Sentiment Analysis System.

Python 1,942 369 Updated Aug 20, 2024

HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance

Python 2,093 158 Updated Feb 14, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,593 911 Updated Feb 14, 2025

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 22,404 2,233 Updated Feb 14, 2025

[NeurIPS'24 Spotlight, ICLR'25] Speeds up long-context LLM inference by computing attention approximately with dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an …

Python 908 45 Updated Feb 12, 2025

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Python 3,583 299 Updated Feb 13, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 40,460 4,967 Updated Feb 14, 2025

⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / LLaMA Factory / Ultralytics / veRL …

Python 783 60 Updated Feb 13, 2025

3D Visualization of a GPT-style LLM

TypeScript 4,414 490 Updated Aug 24, 2024

[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"

Jupyter Notebook 1,531 141 Updated Aug 15, 2024

📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion

Python 1,952 151 Updated Feb 13, 2025
Python 10,106 1,304 Updated Feb 1, 2025

SD.Next: All-in-one for AI generative image

Python 6,008 456 Updated Feb 14, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 39,772 5,287 Updated Feb 14, 2025

Stable Diffusion web UI

Python 147,777 27,615 Updated Feb 10, 2025

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 67,079 7,193 Updated Feb 14, 2025

OneDiff: An out-of-the-box acceleration library for diffusion models.

Jupyter Notebook 1,808 122 Updated Jan 13, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 37,828 5,682 Updated Feb 14, 2025

Use PEFT or full-parameter training to fine-tune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,441 473 Updated Feb 14, 2025

Start building LLM-empowered multi-agent applications in an easier way.

Python 6,164 365 Updated Feb 14, 2025

An awesome & curated list of best LLMOps tools for developers

Shell 4,391 423 Updated Feb 11, 2025

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …

Python 6,330 520 Updated Feb 13, 2025

RayLLM - LLMs on Ray

Python 1,254 93 Updated May 28, 2024