Stars
lingorX / BEV-Scene-Graph
Forked from DefaultRui/BEV-Scene-Graph
[ICCV23] Bird’s-Eye-View Scene Graph for Vision-Language Navigation
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
(ICCV23 Oral) LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning
"From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models" [Uziel, Dinari, and Freifeld, NeurIPS 2023]
[NeurIPS2023] Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
Official code for VisProg (CVPR 2023 Best Paper!)
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
QLoRA: Efficient Finetuning of Quantized LLMs
(CVPR 2020) Block-wisely Supervised Neural Architecture Search with Knowledge Distillation
(ICCV 2021) BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search
“Population-based Cooperative Gaming for Unsupervised Person Re-identification”, IJCV 2023.
"Multiple Expert Brainstorming for Domain Adaptive Person Re-identification", ECCV 2020
Repository of our CVPR2023 paper "Lana: A Language-Capable Navigator for Instruction Following and Generation"
Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention (CVPR 2023)
Implementation of "Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents"
Official code for CVPR2023 Boosting Video Object Segmentation via Space-time Correspondence Learning
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
A Robust, Real-time, RGB-colored, LiDAR-Inertial-Visual tightly-coupled state Estimation and mapping package
A Chinese-language guide to tuning ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you tell it.
🐙 Guides, papers, lecture, notebooks and resources for prompt engineering
PyTorch Explain: Interpretable Deep Learning in Python.
Series of work (ECCV2020, CVPR2021, CVPR2021, ECCV2022) about Compositional Learning for Human-Object Interaction Exploration
[NeurIPS 2022 Spotlight] GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models
[NeurIPS 2022 Spotlight] Learning Equivariant Segmentation with Instance-Unique Querying