CV
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
[AAAI 2025] Official code for "ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models".
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Stable Diffusion web UI
Retrieval and Retrieval-augmented LLMs
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
[ICML 2024] EvTexture: Event-driven Texture Enhancement for Video Super-Resolution
Bringing Old Photos Back to Life (CVPR 2020 oral)
TransNet V2: Shot Boundary Detection Neural Network
TensorRT Extension for Stable Diffusion Web UI
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
[CVPR'25] Official implementation of the paper "MagicQuill: An Intelligent Interactive Image Editing System"
End-to-End Object Detection with Transformers