Stars
VMamba: Visual State Space Models (code is based on Mamba)
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation
A Large Language-Vision Assistant for Pathology Image Understanding (BIBM-2024)
Official Repository of Personalized Visual Instruct Tuning
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.