Stars
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
High-resolution models for human tasks.
Refine high-quality datasets and visual AI models
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
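A minimal sketch of the Transformers pipeline API, the usual entry point to the library; the first call downloads the task's default checkpoint from the Hugging Face Hub, and the image path is a placeholder:

```python
# Run a pretrained image classifier in a few lines via the pipeline API.
from transformers import pipeline

classifier = pipeline("image-classification")  # downloads the task's default checkpoint
predictions = classifier("example.jpg")        # accepts a local path, URL, or PIL image
print(predictions)                             # list of {'label': ..., 'score': ...} dicts
```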
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
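Both Depth Anything releases are usable through the Hugging Face depth-estimation pipeline; a minimal sketch, assuming the V2-Small checkpoint id on the Hub (swap in whichever checkpoint you actually use):

```python
# Monocular depth estimation with a Depth Anything checkpoint via transformers.
from PIL import Image
from transformers import pipeline

# The model id below is an assumption; any compatible depth-estimation checkpoint works here.
depth_estimator = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Small-hf")
result = depth_estimator(Image.open("example.jpg"))
result["depth"].save("example_depth.png")  # PIL image of the predicted depth map
```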
Effortless data labeling with AI support from Segment Anything and other awesome models.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Accepted as a [NeurIPS 2024] Spotlight presentation paper
A visual no-code/code-free web crawler/spider — EasySpider (易采集): a visual browser automation, data collection, and web crawling tool that lets you design and run crawler tasks graphically, without writing code. Also known as ServiceWrapper, an intelligent service encapsulation system for web applications.
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
This repository has moved. The new location is https://github.com/TexasInstruments/edgeai-tensorlab
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
An Unreal Engine 5 (UE5) based plugin aiming to provide real-time visualization, management, editing, and scalable hybrid rendering of Gaussian Splatting models.
Generative Models by Stability AI
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
Official Code for DragGAN (SIGGRAPH 2023)
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
Object detection, 3D detection, and pose estimation using center point detection
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
StableLM: Stability AI Language Models
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
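A minimal sketch of the point-prompted inference described above, assuming the segment-anything package is installed and the ViT-H checkpoint has been downloaded separately; the image path and prompt point are placeholders:

```python
# Point-prompted segmentation with the Segment Anything Model (SAM).
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # local checkpoint path
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)    # SAM expects RGB uint8
predictor.set_image(image)

masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),   # one foreground click, in (x, y) pixel coordinates
    point_labels=np.array([1]),            # 1 = foreground, 0 = background
    multimask_output=True,                 # return up to three candidate masks
)
print(masks.shape, scores)                 # (3, H, W) boolean masks with confidence scores
```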
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
CV-CUDA™ is an open-source, GPU-accelerated library for cloud-scale image processing and computer vision.