Stars
A collection of graph foundation models, including papers, code, and datasets.
Updated Apr 13, 2025
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Mora: More like Sora for Generalist Video Generation
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Artistic Vision-Language Understanding with Adapter-enhanced MiniGPT-4