Stars
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Wan: Open and Advanced Large-Scale Video Generative Models
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
DeepEP: an efficient expert-parallel communication library
A simple screen-parsing tool for pure vision-based GUI agents
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022
ComfyUI's ControlNet Auxiliary Preprocessors
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
Industry-leading face manipulation platform
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
A third-party ChatGPT web UI built with Express and Vue3, using the official OpenAI completion API.
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25).
ComfyUI TRELLIS is a large 3D asset generation model that outputs various formats, such as Radiance Fields, 3D Gaussians, and meshes. The cornerstone of TRELLIS is a unified Structured LATent (SLAT) representation…
Redux StyleModelApply adds more controls
To make ComfyUI easier to use, I have optimized and integrated some commonly used nodes.
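Open-source, free, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning and generating QR codes. Includes built-in multi-language libraries.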
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Multilingual large voice generation model, providing full-stack inference, training and deployment capabilities.
羊了个羊 (Sheep a Sheep) + deep reinforcement learning (Deep Reinforcement Learning + 3 Tiles Game)
[ICLR'24] Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching