Stars
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 275+ supported cars.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
A curated list of awesome Deep Stereo Matching resources
The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation"
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
A series of basic algorithms that are useful for video understanding, including Single Object Tracking (SOT), Video Object Segmentation (VOS) and so on.
Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
A 3D vision library from 2D keypoints: monocular and stereo 3D detection for humans, social distancing, and body orientation.
Let us democratise high-resolution generation! (CVPR 2024)
CoTracker is a model for tracking any point (pixel) on a video.
fabio-sim / LightGlue-ONNX
Forked from cvg/LightGlue
ONNX-compatible LightGlue: Local Feature Matching at Light Speed. Supports TensorRT, OpenVINO
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
500 AI, Machine Learning, Deep Learning, Computer Vision, and NLP projects with code