Xiongkun Linghu Germany321

Long-term goal: building powerful, reliable, and safe embodied generalist agents in the physical world.

9 followers · 31 following

Beijing Institute for General Artificial Intelligence
Beijing
https://xiongkunlinghu.github.io/

Starred repositories

HKUST-LongGroup / Awesome-MLLM-Benchmarks

45 4 Updated Oct 8, 2024

sg-3d / sg3d

Python 40 1 Updated Oct 3, 2024

ashkamath / mdetr

Python 968 125 Updated Oct 3, 2022

UX-Decoder / LLaVA-Grounding

Python 342 13 Updated Jul 29, 2024

chuanyangjin / MMToM-QA

[🏆Outstanding Paper Award at ACL 2024] MMToM-QA: Multimodal Theory of Mind Question Answering

Python 119 14 Updated Sep 8, 2024

salesforce / LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,752 955 Updated Oct 11, 2024

BAAI-DCAI / SpatialBot

The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models".

Python 147 9 Updated Sep 19, 2024

IDEA-Research / GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 6,504 667 Updated Aug 12, 2024

IDEA-Research / Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Python 740 22 Updated Aug 9, 2024

apple / ml-ferret

Python 8,373 491 Updated Oct 9, 2024

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

18,195 1,464 Updated Oct 7, 2024

pipilurj / bootstrapped-preference-optimization-BPO

code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"

Python 43 1 Updated Aug 23, 2024

Krasjet / quaternion

A brief introduction to quaternions and their applications in 3D geometry.

HTML 1,721 273 Updated Sep 9, 2021

ZrrSkywalker / MAVIS

Mathematical Visual Instruction Tuning for Multi-modal Large Language Models

101 1 Updated Aug 5, 2024

OpenRobotLab / GRUtopia

GRUtopia: Dream General Robots in a City at Scale

Python 482 23 Updated Sep 5, 2024

UMass-Foundation-Model / 3D-VLA

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

Python 326 13 Updated Oct 7, 2024

MIT-SPARK / Hydra

C++ 598 72 Updated Oct 11, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,723 113 Updated Sep 19, 2024

penghao-wu / vstar

PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Python 512 33 Updated Jan 7, 2024

liudaizong / Awesome-3D-Visual-Grounding

😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.

66 1 Updated Oct 2, 2024

RL4VLM / RL4VLM

Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Jupyter Notebook 187 19 Updated Sep 26, 2024

nerfies / nerfies.github.io

JavaScript 2,426 858 Updated Jun 21, 2024

KindXiaoming / pykan

Kolmogorov Arnold Networks

Jupyter Notebook 14,813 1,356 Updated Sep 15, 2024

facebookresearch / open-eqa

OpenEQA: Embodied Question Answering in the Era of Foundation Models

Jupyter Notebook 212 20 Updated Sep 20, 2024

OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,717 370 Updated Mar 14, 2024

OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.

Python 5,721 447 Updated Sep 19, 2024

tsb0601 / MMVP

Python 281 7 Updated Jan 27, 2024

MJ10 / BioSeq-GFN-AL

Code for "Biological Sequence Design with GFlowNets", 2022

Python 70 16 Updated Mar 17, 2023

allenai / unified-io-2

Python 563 27 Updated Feb 15, 2024

ActiveVisionLab / Awesome-LLM-3D

Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world

1,055 73 Updated Oct 8, 2024