
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Python 286 19 Updated Jan 2, 2025

Next-Token Prediction is All You Need

Python 2,028 78 Updated Oct 24, 2024

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,251 55 Updated Mar 12, 2025

📖 This is a repository for organizing papers, code, and other resources related to unified multimodal models.

397 16 Updated Jan 18, 2025

[CSUR] A Survey on Video Diffusion Models

2,016 104 Updated Dec 9, 2024

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,150 164 Updated Feb 13, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

480 20 Updated Dec 14, 2024

Python 34 2 Updated Jul 9, 2024

Easily compute clip embeddings and build a clip retrieval system with them

Jupyter Notebook 2,506 220 Updated Apr 15, 2024

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models

Python 632 31 Updated Dec 23, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 552 42 Updated May 8, 2024

✨✨Latest Advances on Multimodal Large Language Models

14,227 918 Updated Mar 5, 2025

LLM hallucination paper list

309 22 Updated Mar 11, 2024

This is a third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.

Python 546 98 Updated Jun 25, 2024

The pure and clear PyTorch Distributed Training Framework.

Python 276 56 Updated Jan 24, 2024

Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.

Python 1 Updated Feb 4, 2020