Solve Visual Understanding with Reinforced VLMs

Python 3,644 218 Updated Feb 28, 2025

🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Python 931 59 Updated Feb 25, 2025

Code for running baseline models/experiments with the Fields of The World dataset

Jupyter Notebook 74 6 Updated Dec 18, 2024

[CVPR 2025] SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images

Python 74 1 Updated Dec 13, 2024

Geospatial library wheels for Python on Windows.

606 54 Updated Jan 19, 2025

UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery, ISPRS. Also, including other vision transformers and CNNs for satellite, aerial image …

Python 833 124 Updated Aug 19, 2024

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 3,229 381 Updated Feb 14, 2025

ComfyUI-IF_AI_tools is a set of custom nodes for ComfyUI that allows you to generate prompts using a local Large Language Model (LLM) via Ollama. This tool enables you to enhance your image generat…

Python 607 47 Updated Jan 3, 2025

Official code of Remote Sensing Mamba

Python 269 14 Updated Apr 25, 2024

[NeurIPS 2024] Code release for "Segment Anything without Supervision"

Jupyter Notebook 449 28 Updated Oct 6, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,859 128 Updated Oct 30, 2024

Official implementation of the CVPR24 highlight paper: Matching Anything by Segmenting Anything

Python 1,219 80 Updated Nov 7, 2024

Multimodal Models in Real World

Jupyter Notebook 440 20 Updated Feb 24, 2025

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Python 832 61 Updated Jul 10, 2024

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

Python 197 7 Updated Jun 9, 2024

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 1,096 86 Updated Oct 21, 2024

Start building LLM-empowered multi-agent applications in an easier way.

Python 6,393 376 Updated Feb 24, 2025

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,242 281 Updated May 4, 2024

An intelligent assistant serving the entire software development lifecycle, powered by a Multi-Agent Framework, working with DevOps Toolkits, Code&Doc Repo RAG, etc.

Python 1,129 116 Updated Jul 1, 2024

[AAAI 2025] Official PyTorch implementation of "TinySAM: Pushing the Envelope for Efficient Segment Anything Model"

Python 445 27 Updated Jan 19, 2025

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Python 2,407 160 Updated Oct 21, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 550 41 Updated May 8, 2024

Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds

Python 1,559 106 Updated Jul 22, 2024

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 833 42 Updated Nov 23, 2024

« usbkill » is an anti-forensic kill-switch that waits for a change on your USB ports and then immediately shuts down your computer.

Python 4,501 511 Updated Mar 1, 2024

Infinite Photorealistic Worlds using Procedural Generation

Python 6,263 505 Updated Jan 8, 2025

ArcGIS Python Toolbox for WhiteboxTools

Python 275 66 Updated Nov 11, 2024

Cesium development template based on vueCli 4.x.x + and electron 6.x.x +

JavaScript 48 20 Updated Jul 31, 2020

WebGL point cloud viewer for large datasets

JavaScript 4,764 1,217 Updated Aug 24, 2024