15 results for source starred repositories

✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 287 19 Updated Jan 2, 2025

Next-Token Prediction is All You Need

Python 2,029 78 Updated Oct 24, 2024

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,252 55 Updated Mar 12, 2025

📖 This is a repository for organizing papers, code, and other resources related to unified multimodal models.

399 16 Updated Jan 18, 2025

[CSUR] A Survey on Video Diffusion Models

2,016 104 Updated Dec 9, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,152 164 Updated Feb 13, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

481 20 Updated Dec 14, 2024

Python 35 2 Updated Jul 9, 2024

Easily compute clip embeddings and build a clip retrieval system with them

Jupyter Notebook 2,507 220 Updated Apr 15, 2024

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models

Python 632 31 Updated Dec 23, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 552 42 Updated May 8, 2024

✨✨Latest Advances on Multimodal Large Language Models

14,236 918 Updated Mar 5, 2025

LLM hallucination paper list

309 22 Updated Mar 11, 2024

This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.

Python 546 98 Updated Jun 25, 2024

The pure and clear PyTorch Distributed Training Framework.

Python 276 56 Updated Jan 24, 2024