Karlsruhe Institute of Technology (KIT)
Stars
Official Implementation for ICCV'23 paper Coarse-to-Fine Amodal Segmentation with Shape Prior (C2F-Seg).
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
PyTorch Implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Collection of 2D Datasets for Anatomy Segmentation
Python scripts to download Assembly101 from Google Drive
MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Code for OrthDNNs: Orthogonal Deep Neural Networks
Map (deep learning) model weights between different model implementations.
A framework for annotating 3D meshes using the predictions of a 2D semantic segmentation model.
Universal Tensor Operations in Einstein-Inspired Notation for Python.
Statewide Visual Geolocalization in the Wild (ECCV 2024)
The implementation of NoiseEraSAR in "Skeleton-Based Human Action Recognition with Noisy Labels"
[ICCV 2023] RLIPv2: Fast Scaling of Relational Language-Image Pre-training
🎥 Python and OpenCV-based scene cut/transition detection program & library.
VMamba: Visual State Space Models; code is based on Mamba
RoDLA: Benchmarking the Robustness of Document Layout Analysis Models
A curated paper list of awesome skeleton-based action recognition.
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Implementation of Vision Mamba from the paper: "Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model". It's 2.8x faster than DeiT and saves 86.8% GPU memory wh…