
Monograph-Image-Retrieval

The goal of this project is to use recent advances in monocular depth estimation, instance segmentation, scene graphs, and graph neural networks to push the state of the art in indoor image retrieval.

Set up the environment on the ETH Zurich Euler cluster

Clone this public repo:

git clone https://github.com/rupalsaxena/Monograph-Image-Retrieval.git

TODO: add a virtual environment to the package, document how to activate it, and describe how to request compute resources via Slurm before running the project.

Pipeline overview

Shown below is the pipeline overview of this project.

[Pipeline overview diagram]

The sections below show how to run each element of this pipeline, one at a time.

Hypersim Data

The Hypersim dataset, developed by Apple Inc., is a collection of photorealistic synthetic images of indoor scenes with per-pixel ground-truth labels. Each frame provides an RGB image together with ground-truth semantic instance and depth maps.

Download the data from this link: HypersimData

RGB to Semantic Instances

STEP 1: Save RGB and ground-truth semantic instances in torch format using the dataloader.

# navigate to hypersim dataloader
cd src/Monograph/dataloader/hypersim_pytorch

In your IDE, update the config parameters, especially PURPOSE="Semantic". Then run the following command to save the Hypersim data in PyTorch format:

# save hypersim data in torch format
python3 save_hypersim_dataset.py

The data will be saved in torch format at the output path you provided.
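
If you want to sanity-check the saved data, a minimal sketch like the one below can load it back. The file name output/hypersim_semantic.pt is hypothetical; substitute the output path you configured.

# inspect the saved torch dataset (path below is hypothetical)
import torch

dataset = torch.load("output/hypersim_semantic.pt")

# expect pairs of RGB tensors and semantic instance labels
rgb, semantic = dataset[0]
print("RGB shape:", rgb.shape)            # e.g. [3, H, W]
print("Semantic shape:", semantic.shape)  # e.g. [H, W] integer labels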

STEP 2: Train a DeepLabv3 ResNet-50 model using transfer learning.

# navigate to rgb-to-semantics directory
cd ../../rgb-to-semantics 

Update the input path and the trained-model output path in config.py, then train the model:

python3 transfer_learning.py

Once training is over, the trained model is saved to the output path you provided.
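
For orientation, here is a minimal sketch of the transfer-learning step: load a pretrained DeepLabv3 ResNet-50 from torchvision, replace its classification head, and fine-tune on RGB/semantic pairs. The class count (an NYU40-style label set) and the toy tensors are assumptions; transfer_learning.py together with config.py is the actual entry point.

# transfer-learning sketch: new head on a pretrained DeepLabv3 ResNet-50
import torch
from torchvision.models.segmentation import deeplabv3_resnet50
from torchvision.models.segmentation.deeplabv3 import DeepLabHead

NUM_CLASSES = 41  # assumption: NYU40-style labels plus background

model = deeplabv3_resnet50(pretrained=True)        # pretrained backbone
model.classifier = DeepLabHead(2048, NUM_CLASSES)  # fresh head for our classes

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)

model.train()
rgb = torch.randn(2, 3, 256, 256)                      # toy batch of images
target = torch.randint(0, NUM_CLASSES, (2, 256, 256))  # toy label maps

optimizer.zero_grad()
out = model(rgb)["out"]        # [B, NUM_CLASSES, H, W] logits
loss = criterion(out, target)
loss.backward()
optimizer.step()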

STEP 3: Save the predicted semantic data. Update MODELPATH, TESTDATAPATH, and main_path in test_model.py, then run the file to generate the predicted semantic instances.

python3 test_model.py

The predicted semantic images are written to the main_path set in config.py.
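
Conceptually, test_model.py runs an inference pass like the sketch below. The model path is hypothetical, and the sketch assumes the whole model object was serialized with torch.save (adapt it if only a state_dict was saved).

# inference sketch: predict per-pixel semantic labels with the trained model
import torch

model = torch.load("models/deeplabv3_hypersim.pt")  # hypothetical MODELPATH
model.eval()

rgb = torch.randn(1, 3, 256, 256)  # stand-in for a preprocessed test image
with torch.no_grad():
    logits = model(rgb)["out"]              # [1, C, H, W]
    pred = logits.argmax(dim=1).squeeze(0)  # [H, W] predicted class ids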

RGB to Depth

Predict depth from RGB using the pretrained AdelaiDepth network.

Navigate to the rgb-to-depth directory:

cd ../rgb-to-depth 

Update the image_dir and image_dir_out paths in test_depth.py, then run:

python3 test_depth.py 

The predicted depth data will be written to the image_dir_out path set above.
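
test_depth.py wraps AdelaiDepth's own inference code. Purely for illustration, the sketch below performs the same RGB-to-depth step with MiDaS, a different pretrained monocular depth network published via torch.hub, swapped in; the input file name is hypothetical.

# depth-inference sketch with MiDaS standing in for AdelaiDepth
import numpy as np
import torch
from PIL import Image

model = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
model.eval()

img = np.array(Image.open("scene.jpg").convert("RGB"))  # hypothetical input
batch = transform(img)                 # [1, 3, h, w] network-ready tensor
with torch.no_grad():
    depth = model(batch)               # [1, h, w] relative inverse depth
depth = depth.squeeze().cpu().numpy()  # save this to image_dir_out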

Generate Scene Graph
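
This step is not yet documented in the repo. As a rough sketch of the idea, assuming instance masks and a depth map from the previous stages are available: build one node per instance (with simple depth and size features) and connect instances whose centers are close.

# scene-graph sketch: nodes are instances, edges connect nearby instances
import torch

def build_scene_graph(instances, depth, dist_thresh=50.0):
    # instances: [H, W] integer instance ids (0 = background)
    # depth:     [H, W] predicted depth values
    ids = [i for i in instances.unique().tolist() if i != 0]
    feats, centers = [], []
    for i in ids:
        mask = instances == i
        ys, xs = torch.nonzero(mask, as_tuple=True)
        d = depth[mask].mean()
        centers.append(torch.stack([xs.float().mean(), ys.float().mean(), d]))
        feats.append(torch.stack([d, mask.float().mean(), torch.tensor(float(i))]))
    centers = torch.stack(centers)
    # edge between instance pairs whose (x, y, depth) centers are close;
    # mixing pixel and depth units here is a deliberate simplification
    edges = [(a, b) for a in range(len(ids)) for b in range(len(ids))
             if a != b and torch.dist(centers[a], centers[b]) < dist_thresh]
    edge_index = (torch.tensor(edges, dtype=torch.long).t()
                  if edges else torch.empty(2, 0, dtype=torch.long))
    return torch.stack(feats), edge_index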


Train GCN network with Triplet loss using Generated Scene Graphs
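
This step is likewise undocumented here; the sketch below shows the general recipe, assuming PyTorch Geometric is installed: a two-layer GCN pools each scene graph into a fixed-size embedding and is trained with a triplet margin loss so that graphs of nearby views embed close together. All tensors are toy stand-ins for real scene graphs.

# GCN embedding sketch trained with triplet loss (assumes torch_geometric)
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class GraphEmbedder(torch.nn.Module):
    def __init__(self, in_dim=3, hidden=64, out_dim=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, out_dim)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index)
        return global_mean_pool(x, batch)  # one embedding per graph

model = GraphEmbedder()
criterion = torch.nn.TripletMarginLoss(margin=1.0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# anchor/positive: graphs of nearby views; negative: a different scene
x = torch.randn(5, 3)                            # toy node features
ei = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])  # toy edges, shape [2, E]
b = torch.zeros(5, dtype=torch.long)             # all nodes in one graph

optimizer.zero_grad()
z_a = model(x, ei, b)
z_p = model(x + 0.01 * torch.randn_like(x), ei, b)
z_n = model(torch.randn(5, 3), ei, b)
loss = criterion(z_a, z_p, z_n)
loss.backward()
optimizer.step()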


Proximity Matching
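
Given trained graph embeddings, proximity matching reduces to nearest-neighbor search in embedding space: the database image whose embedding is closest to the query's embedding is the retrieval result. A minimal sketch with toy data:

# proximity-matching sketch: nearest neighbor in embedding space
import torch

gallery = torch.randn(100, 32)  # toy embeddings of database images
query = torch.randn(1, 32)      # toy embedding of the query image

dists = torch.cdist(query, gallery)  # [1, 100] pairwise L2 distances
best = dists.argmin(dim=1).item()
print("retrieved image index:", best)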

