LAVIS - A One-stop Library for Language-Vision Intelligence
Official code for the paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024].
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA), built toward GPT-4V-level capabilities and beyond.
The official repository of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.