Paper: "Depth as attention to learn image representations for visual localization, using monocular images" (https://doi.org/10.1016/j.jvcir.2023.104012)
- Depth image rendering
- Using data_prep/gen_depthmaps.py
- ZoeDepth is available on torch.hub, so you can either load it directly or use a local clone of the repository (see the sketch below)
- Refer to https://github.com/isl-org/ZoeDepth for any ZoeDepth loading issues
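
Loading ZoeDepth via torch.hub typically looks like the following. This is a minimal sketch based on the upstream ZoeDepth README, not necessarily the exact invocation used in data_prep/gen_depthmaps.py; the input image path is hypothetical.

```python
# Minimal ZoeDepth loading sketch via torch.hub (per isl-org/ZoeDepth);
# gen_depthmaps.py may load the model differently (e.g. from a local clone).
import torch
from PIL import Image

# "ZoeD_N" is one of the published checkpoints; "ZoeD_NK" is another option.
model = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True)
model = model.to("cuda" if torch.cuda.is_available() else "cpu").eval()

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
depth = model.infer_pil(image)                    # numpy array of per-pixel metric depth
```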
- Training
- train.py
- Configurations defined in config/train.ini
- Provide command line arguments accordingly, especially dataset_root_dir
- Training supports both Cambridge Landmarks and Mapillary Street-level Sequences datasets
- Supports netvlad, max, and gem descriptors/pooling (a generic GeM sketch is shown after this list)
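
For illustration, a generic GeM (generalized-mean) pooling layer is sketched below. This is the standard formulation of one of the supported pooling options and may differ in detail from the implementation in this repository.

```python
# Generic GeM pooling sketch; the repository's own implementation may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeM(nn.Module):
    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.ones(1) * p)  # learnable pooling exponent
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: feature map of shape (B, C, H, W) -> global descriptor of shape (B, C)
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.avg_pool2d(x, kernel_size=(x.size(-2), x.size(-1)))
        return x.pow(1.0 / self.p).flatten(1)
```

Max pooling corresponds to the limit p → ∞ and average pooling to p = 1, so GeM acts as a learnable middle ground between the two.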
- Evaluation
- xx_retrieval.py performs the retrieval and creates the result file
- xx_pos_eval.py reports Recall@k and average positional error (avg.pos.err) @k results (see the sketch below)
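
As an illustration of the reported metrics, the sketch below computes Recall@k and avg.pos.err@k from ranked retrieval results. The function name, array layout, and distance threshold are assumptions and do not mirror the exact result-file format produced by xx_retrieval.py.

```python
# Sketch of Recall@k and avg.pos.err@k computation; inputs are assumed arrays,
# not the actual result-file format used by xx_pos_eval.py.
import numpy as np

def recall_and_pos_err_at_k(query_xyz, db_xyz, ranked_db_indices, k=5, dist_thresh=25.0):
    """query_xyz: (Q, 3) query positions; db_xyz: (N, 3) database positions;
    ranked_db_indices: (Q, N) database indices sorted by descriptor similarity."""
    hits, pos_errs = 0, []
    for q in range(len(query_xyz)):
        top_k = ranked_db_indices[q, :k]
        dists = np.linalg.norm(db_xyz[top_k] - query_xyz[q], axis=1)
        if (dists <= dist_thresh).any():  # Recall@k: any top-k match within threshold
            hits += 1
        pos_errs.append(dists.min())      # avg.pos.err@k: best position error in top-k
    return hits / len(query_xyz), float(np.mean(pos_errs))
```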
Significant parts of the code are based on the following repositories: https://github.com/mapillary/mapillary_sls and https://github.com/QVPR/Patch-NetVLAD