- The Chinese University of Hong Kong
- HKSAR, China
- https://x-lai.github.io/
Stars
Project page for "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers
Official Implementation for "Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition"
Official repository for VisionZip (CVPR 2025)
Unified Language-driven Zero-shot Domain Adaptation (CVPR 2024)
Official Codebase of "DiffComplete: Diffusion-based Generative 3D Shape Completion"
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
[CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
Reference implementation for DPO (Direct Preference Optimization)
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Robust recipes to align language models with human and AI preferences
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
GLM-4 series: Open Multilingual Multimodal Chat LMs
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
The official source code for "X-Ray: A Sequential 3D Representation for Generation".
A PyTorch native library for large model training
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
Code examples and resources for DBRX, a large language model developed by Databricks
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
[CVPR 2024] GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding