Tsinghua University
- Beijing
Lists (26)
Cross View Match
Multi Modal Sensors
3D Reconstruction
Image Matching
Foundation Model
Visual Place Recognition
Event Camera
Anything about the D(ynamic) V(ision) S(ensor)
Continue Learning
Navigation
Dataset
Satellite Imagery Comprehension
Deep Learning Methods for Satellite Imagery Comprehension and Processing
Neural Map
Generation Model
UAV Localization
UAV Autonomous Geo-Localization in Unknown Environment
SNN
Competition
Lesson
Daily Life
Hardware Setting
Deep Learning Tips
Tips for Deployment of Deep Learning Model
Scene Coordinates Regression
Useful Utility
Some useful utilities for better research and learning
Uncertainty
VIO
Equivariant Networks
Deep Dense Image Alignment
Deep Learning based Dense Image Alignment Methods
Starred repositories
Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"
Code for "MatchAnything: Universal Cross-Modality Image Matching with Large-Scale Pre-Training", Arxiv 2025.
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
Statewide Visual Geolocalization in the Wild (ECCV 2024)
[ECCV'24 Oral] Anytime Continual Learning for Open Vocabulary Classification
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
M2DGR: a Multi-modal and Multi-scenario Dataset for Ground Robots (RA-L 2021 & ICRA 2022)
GeoCalib: Learning Single-image Calibration with Geometric Optimization (ECCV 2024)
🌟A curated list of DUSt3R-related papers and resources, tracking recent advancements using this geometric foundation model.
Real-time dense scene reconstruction with SLAM3R
[ECCV 2024] The official implementation of the paper "Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network".
Implementation of the proposed LVMAE, from the paper, Extending Video Masked Autoencoders to 128 frames, in Pytorch
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions
[Pytorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
[🎉IEEE TGRS'24] The official code for paper "CAMP: A Cross-View Geo-Localization Method using Contrastive Attributes Mining and Position-aware Partitioning"
[Arxiv'24] OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances
[IROS 2024] BEVLoc: Cross-View Localization and Matching via Birds-Eye-View Synthesis
Official implementation of "Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training" (IEEE T-PAMI 2025)
[IEEE JSTARS 2024] CV-Cities: Advancing Cross-view Geo-localization in Global Cities
Source Code for Paper "OrienterNet: Visual Localization in 2D Public Maps with Neural Matching"