- Tsinghua University
- Beijing
Lists (26)
3D Reconstruction
Competition
Continue Learning
Cross View Match
Daily Life
Dataset
Deep Dense Image Alignment
Deep Learning based Dense Image Alignment Methods
Deep Learning Tips
Tips for Deploying Deep Learning Models
Equivariant Networks
Event Camera
Anything about the D(ynamic) V(ision) S(ensor)
Foundation Model
Generation Model
Hardware Setting
Image Matching
Lesson
Multi Modal Sensors
Navigation
Neural Map
Satellite Imagery Comprehension
Deep Learning Methods for Satellite Imagery Comprehension and Processing
Scene Coordinates Regression
SNN
UAV Localization
UAV Autonomous Geo-Localization in Unknown Environments
Uncertainty
Useful Utility
Some useful utilities for better research and learning
VIO
Visual Place Recognition
Starred repositories
Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"
Code for "MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training", Arxiv 2025.
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
Statewide Visual Geolocalization in the Wild (ECCV 2024)
[ECCV'24 Oral] Anytime Continual Learning for Open Vocabulary Classification
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
M2DGR: a Multi-modal and Multi-scenario Dataset for Ground Robots (RA-L 2021 & ICRA 2022)
GeoCalib: Learning Single-image Calibration with Geometric Optimization (ECCV 2024)
🌟A curated list of DUSt3R-related papers and resources, tracking recent advancements using this geometric foundation model.
Real-time dense scene reconstruction with SLAM3R
[ECCV 2024] The official implementation of the paper "Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network".
Implementation of the proposed LVMAE, from the paper "Extending Video Masked Autoencoders to 128 frames", in PyTorch
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
[🎉IEEE TGRS'24] The official code for the paper "CAMP: A Cross-View Geo-Localization Method using Contrastive Attributes Mining and Position-aware Partitioning"
[Arxiv'24] OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances
[IROS 2024] BEVLoc: Cross-View Localization and Matching via Birds-Eye-View Synthesis
Official implementation of "Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training" (IEEE T-PAMI 2025)
[IEEE JSTARS 2024] CV-Cities: Advancing Cross-view Geo-localization in Global Cities
Source code for the paper "OrienterNet: Visual Localization in 2D Public Maps with Neural Matching"