The Open-Source Data Annotation Platform

TypeScript 636 52 Updated Jan 17, 2025

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,443 428 Updated Jan 3, 2025

Code Release of F-LMM: Grounding Frozen Large Multimodal Models

Python 60 1 Updated Aug 5, 2024
TypeScript 39 8 Updated Oct 20, 2023

Evaluation code for Ref-L4, a new REC benchmark in the LMM era

Python 22 Updated Dec 28, 2024

[CVPR 24] The repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / gra…

Python 209 9 Updated Sep 30, 2024
Python 341 24 Updated Nov 5, 2024

Densely Captioned Images (DCI) dataset repository.

Python 167 5 Updated Jul 1, 2024

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge managemen…

TypeScript 51,826 11,221 Updated Jan 20, 2025

📚 A collection of papers about Referring Image Segmentation.

660 58 Updated Nov 11, 2024

[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Python 119 10 Updated Aug 31, 2024

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 208 10 Updated Dec 30, 2024

PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. PixelLM was accepted to CVPR 2024.

Python 195 5 Updated Jun 3, 2024

[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"

Python 4,459 413 Updated Aug 19, 2024

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 815 41 Updated Nov 23, 2024

[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Python 141 4 Updated Sep 25, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,709 1,355 Updated Dec 25, 2024

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 1,960 134 Updated Dec 30, 2024

A repository for surgical action triplet dataset. Data are videos of laparoscopic cholecystectomy that have been annotated with <instrument, verb, target> labels for every surgical fine-grained act…

Python 46 5 Updated Aug 7, 2023

Class notes for CS 131.

TeX 753 389 Updated Sep 12, 2022

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,333 883 Updated Jul 1, 2024

The Student's Guide to @lintool

316 23 Updated Dec 15, 2024

Machine Learning Engineering Open Book

Python 12,434 762 Updated Jan 19, 2025

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

3,823 215 Updated Jan 19, 2025

A reading list of video generation

480 34 Updated Jan 20, 2025

Open source codebase powering the HuggingChat app

TypeScript 7,910 1,170 Updated Jan 20, 2025

QA Bot for Hugging Face documentation to accelerate development within the ecosystem.

Python 43 4 Updated May 5, 2024

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 286,450 47,746 Updated Dec 2, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,854 348 Updated Aug 7, 2024

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

1,237 58 Updated Mar 14, 2024