UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery, ISPRS. Also, including other vision transformers and CNNs for satellite, aerial image …

Python 833 124 Updated Aug 19, 2024

lyuwenyu / RT-DETR

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 3,229 381 Updated Feb 14, 2025

if-ai / ComfyUI-IF_AI_tools

ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generat…

Python 607 47 Updated Jan 3, 2025

walking-shadow / Official_Remote_Sensing_Mamba

Official code of Remote Sensing Mamba

Python 269 14 Updated Apr 25, 2024

frank-xwang / UnSAM

[NeurIPS 2024] Code release for "Segment Anything without Supervision"

Jupyter Notebook 449 28 Updated Oct 6, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,859 128 Updated Oct 30, 2024

siyuanliii / masa

Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything

Python 1,219 80 Updated Nov 7, 2024

AILab-CVC / SEED-X

Multimodal Models in Real World

Jupyter Notebook 440 20 Updated Feb 24, 2025

mbzuai-oryx / LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Python 832 61 Updated Jul 10, 2024

Beckschen / ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

Python 197 7 Updated Jun 9, 2024

FoundationVision / GLEE

[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,096 86 Updated Oct 21, 2024

modelscope / agentscope

Start building LLM-empowered multi-agent applications in an easier way.

Python 6,393 376 Updated Feb 24, 2025

dvlab-research / MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,242 281 Updated May 4, 2024

codefuse-ai / codefuse-chatbot

An intelligent assistant serving the entire software development lifecycle, powered by a Multi-Agent Framework, working with DevOps Toolkits, Code&Doc Repo RAG, etc.

Python 1,129 116 Updated Jul 1, 2024

xinghaochen / TinySAM

[AAAI 2025] Official PyTorch implementation of "TinySAM: Pushing the Envelope for Efficient Segment Anything Model"

Python 445 27 Updated Jan 19, 2025

IDEA-Research / T-Rex

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Python 2,407 160 Updated Oct 21, 2024

shenyunhang / APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 550 41 Updated May 8, 2024

qianqianwang68 / omnimotion

Python 2,192 125 Updated Jun 11, 2024

ZrrSkywalker / Personalize-SAM

Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds

Python 1,559 106 Updated Jul 22, 2024

mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 833 42 Updated Nov 23, 2024

hephaest0s / usbkill

« usbkill » is an anti-forensic kill-switch that waits for a change on your USB ports and then immediately shuts down your computer.

Python 4,501 511 Updated Mar 1, 2024

princeton-vl / infinigen

Infinite Photorealistic Worlds using Procedural Generation

Python 6,263 505 Updated Jan 8, 2025

opengeos / WhiteboxTools-ArcGIS

ArcGIS Python Toolbox for WhiteboxTools

Python 275 66 Updated Nov 11, 2024

ShareQiu1994 / cesium-vue-electron

Cesium development template based on vueCli 4.x.x + and electron 6.x.x +

JavaScript 48 20 Updated Jul 31, 2020

potree / potree

WebGL point cloud viewer for large datasets

JavaScript 4,764 1,217 Updated Aug 24, 2024

skylning

Stars

om-ai-lab / VLM-R1

magic-research / Sa2VA

fieldsoftheworld / ftw-baselines

likyoo / SegEarth-OV

cgohlke / geospatial-wheels

WangLibo1995 / GeoSeg