Stars
Large Concept Models: Language modeling in a sentence representation space
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.
🔖🔄 | YOLO to VOC format <> VOC to YOLO format | Multiprocessing support
Repository for the paper: Teaching VLMs to Localize Specific Objects from In-context Examples
[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
[ECCV'24] CLIFF: Continual Latent Diffusion for Open-Vocabulary Object Detection
EventHallusion: Diagnosing Event Hallucinations in Video LLMs
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Writing AI Conference Papers: A Handbook for Beginners
[ECCV 2024] Official implementation of "LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction"
Code for ICML 2020 "Graph Optimal Transport for Cross-Domain Alignment"
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull request…
This repository is an official implementation of the paper "LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection".
ArtFusion: Controllable Arbitrary Style Transfer using Dual Conditional Latent Diffusion Models
Official Repository of "Unpaired Image-to-Image Translation via Neural Schrödinger Bridge" (ICLR 2024)
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
OVMR: Open-Vocabulary Recognition with Multi-Modal References (CVPR24)
Official implementation of the paper "ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection"
Contextual Object Detection with Multimodal Large Language Models
PyTorch Implementation of NACLIP in "Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation"
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset, CVPR2024
A curated list of papers, datasets and resources pertaining to open vocabulary object detection.