A curated publication list on open-vocabulary semantic segmentation and related areas (e.g., zero-shot semantic segmentation).

542 25 Updated Dec 16, 2024

Meta-AI SAM + AMG + Average of Patch-Embedding per instance segment + Clustering = Semantic Segmentation

Jupyter Notebook 4 Updated Nov 26, 2023

A low-cost AI powered robotic arm assistant that listens to your voice commands and can carry out a variety of tabletop tasks.

Python 56 6 Updated May 31, 2024
Python 54 1 Updated Nov 16, 2024

[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images

Python 370 20 Updated Oct 27, 2023

👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...

Python 2,237 187 Updated Feb 12, 2025
Jupyter Notebook 3 Updated Oct 9, 2024

KonIQ-10k Deep Learning Models

Jupyter Notebook 121 22 Updated Sep 29, 2021

[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

Python 212 12 Updated Jan 10, 2025

Uncertainty estimation for anchor-based deep object detectors.

Python 39 6 Updated Nov 24, 2020

Code for our paper "A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving"

Python 83 25 Updated Feb 22, 2021

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 872 57 Updated Feb 12, 2025

Tactile Sensing and Simulation; Visual Tactile Manipulation; Open Source.

289 30 Updated Feb 10, 2025

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,414 237 Updated Dec 1, 2024

[ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting

Python 101 9 Updated Mar 20, 2024

Includes FSC-147-D and the code for training and testing the CounTX model from the paper Open-world Text-specified Object Counting.

Jupyter Notebook 33 3 Updated Sep 27, 2024

Experiment on combining CLIP with SAM to do open-vocabulary image segmentation.

Jupyter Notebook 355 33 Updated Apr 5, 2023

Crop using CLIP

Jupyter Notebook 337 34 Updated Aug 22, 2022
Jupyter Notebook 4 Updated May 30, 2024

Related papers and codes for vision-based robotic grasping

Python 1,309 249 Updated Jun 22, 2023

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Python 1,503 155 Updated Dec 2, 2024

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

Python 1,260 125 Updated Feb 10, 2025

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,240 281 Updated May 4, 2024

[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space

Python 789 31 Updated Oct 17, 2024
Python 3,416 642 Updated Dec 5, 2023

FILM: Frame Interpolation for Large Motion, In ECCV 2022.

Python 2,916 288 Updated Aug 10, 2024

The Arcade Learning Environment (ALE) -- a platform for AI research.

C++ 2,228 436 Updated Feb 15, 2025