xemcerk

🎯

Focusing

Now is better than never

10 followers · 8 following

Stars

opendatalab / LabelLLM

The Open-Source Data Annotation Platform

TypeScript 636 52 Updated Jan 17, 2025

opendatalab / PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,443 428 Updated Jan 3, 2025

wusize / F-LMM

Code Release of F-LMM: Grounding Frozen Large Multimodal Models

Python 60 1 Updated Aug 5, 2024

Creling / obsidian-image-uploader

TypeScript 39 8 Updated Oct 20, 2023

JierunChen / Ref-L4

Evaluation code for Ref-L4, a new REC benchmark in the LMM era

Python 22 Updated Dec 28, 2024

xk-huang / segment-caption-anything

[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…

Python 209 9 Updated Sep 30, 2024

Vision-CAIR / LongVU

Python 341 24 Updated Nov 5, 2024

facebookresearch / DCI

Densely Captioned Images (DCI) dataset repository.

Python 167 5 Updated Jul 1, 2024

lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge managemen…

TypeScript 51,826 11,221 Updated Jan 20, 2025

MarkMoHR / Awesome-Referring-Image-Segmentation

📚 A collection of papers about Referring Image Segmentation.

660 58 Updated Nov 11, 2024

google-research / semivl

[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Python 119 10 Updated Aug 31, 2024

zamling / PSALM

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 208 10 Updated Dec 30, 2024

MaverickRen / PixelLM

PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. PixelLM is accepted by CVPR 2024.

Python 195 5 Updated Jun 3, 2024

UX-Decoder / Segment-Everything-Everywhere-All-At-Once

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python 4,459 413 Updated Aug 19, 2024

mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 815 41 Updated Nov 23, 2024

TempleX98 / MoVA

[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Python 141 4 Updated Sep 25, 2024

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,709 1,355 Updated Dec 25, 2024

dvlab-research / LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 1,960 134 Updated Dec 30, 2024

CAMMA-public / cholect50

A repository for surgical action triplet dataset. Data are videos of laparoscopic cholecystectomy that have been annotated with <instrument, verb, target> labels for every surgical fine-grained act…

Python 46 5 Updated Aug 7, 2023