The goal of this project was to use recent advances in monocular depth estimation, instance segmentation, scene graphs, and graph neural networks to push the state of the art in indoor image retrieval.
Clone this public repo:
git clone https://github.com/rupalsaxena/Monograph-Image-Retrieval.git
TODO: add a virtual environment to the package, document how to activate it, and describe how to request compute resources (e.g., via Slurm) before running the project.
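Until that TODO is resolved, a typical setup looks like the sketch below. The requirements.txt file and the Slurm flags are assumptions and may differ for this repo and your cluster:

# create and activate a virtual environment (requirements.txt is an assumption)
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# example Slurm request for an interactive GPU session; flags are cluster-specific
srun --gres=gpu:1 --mem=16G --time=02:00:00 --pty bash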
Shown below is an overview of the project pipeline. The steps that follow describe how to run each element of the pipeline in turn.
The Hypersim dataset, developed by Apple Inc., is a collection of photorealistic synthetic images of indoor scenes, accompanied by per-pixel ground-truth labels. For each frame it provides RGB, semantic, and depth data.
Download the data from the link: HypersimData
STEP 1: Save RGB and ground-truth semantic instances in torch format using the dataloader.
# navigate to hypersim dataloader
cd src/Monograph/dataloader/hypersim_pytorch
In your IDE, update the config parameters, especially PURPOSE="Semantic". Then run the following command to save the Hypersim data in PyTorch format:
# save hypersim data in torch format
python3 save_hypersim_dataset.py
The data will be saved in torch format at the output path you provided.
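To sanity-check STEP 1, you can load one of the saved files in Python. The file name and the (rgb, semantic) pair structure below are assumptions based on this step's description, not the repo's verbatim layout:

# quick check that the saved torch data loads (file name is an assumption)
import torch

data = torch.load("/path/to/output/hypersim_semantic.pt")
rgb, semantic = data[0]            # assumption: indexable dataset of (rgb, semantic) pairs
print(rgb.shape, semantic.shape)   # e.g. torch.Size([3, H, W]) and torch.Size([H, W])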
STEP 2: Train a DeepLabv3-ResNet50 model using transfer learning.
# navigate to rgb-to-semantics directory
cd ../../rgb-to-semantics
Update the input path and the trained-model output path in config.py, then train the model:
python3 transfer_learning.py
Once training is over, check the provided output path for the trained model.
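For reference, transfer learning on DeepLabv3-ResNet50 with torchvision generally follows the pattern below. This is a minimal sketch, not the repo's exact transfer_learning.py; the class count, paths, and hyperparameters are assumptions:

import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 40  # assumption: number of Hypersim semantic classes used

# load DeepLabv3-ResNet50 with pretrained weights and swap the classifier head
model = models.segmentation.deeplabv3_resnet50(pretrained=True)
model.classifier[4] = nn.Conv2d(256, NUM_CLASSES, kernel_size=1)

dataset = torch.load("/path/to/output/hypersim_semantic.pt")  # STEP 1 output (assumed)
loader = torch.utils.data.DataLoader(dataset, batch_size=4, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for rgb, semantic in loader:
    optimizer.zero_grad()
    out = model(rgb)["out"]               # torchvision returns a dict; "out" is the main head
    loss = criterion(out, semantic.long())
    loss.backward()
    optimizer.step()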
STEP 3: Save the predicted semantic data. Update MODELPATH, TESTDATAPATH, and main_path in the test_model.py file, then run it to generate the predicted semantic instances:
python3 test_model.py
Check the predicted semantic images at the main_path provided in the config.py file.
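Conceptually, this test step runs inference along the lines of the sketch below; it assumes the whole model object was saved with torch.save, and the path and tensor shapes are placeholders:

import torch

MODELPATH = "/path/to/trained_model.pt"   # assumption: matches your config
model = torch.load(MODELPATH)
model.eval()

rgb_batch = torch.rand(1, 3, 480, 640)    # stand-in for a preprocessed test image batch
with torch.no_grad():
    pred = model(rgb_batch)["out"].argmax(dim=1)  # (1, H, W) per-pixel class indices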
STEP 4: Predict depth from RGB using the pretrained AdelaiDepth network.
# navigate to rgb-to-depth directory
cd ../rgb-to-depth
Update the image_dir and image_dir_out paths in test_depth.py. Once updated, run the following:
python3 test_depth.py
The predicted depth data should be available at the image_dir_out path mentioned above.
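To inspect a result, you can visualize one of the generated depth maps. The PNG output format below is an assumption; adjust it to whatever test_depth.py actually writes:

# visualize a predicted depth map (output file name/format are assumptions)
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

depth = np.array(Image.open("image_dir_out/example_depth.png"))
plt.imshow(depth, cmap="plasma")
plt.colorbar(label="relative depth")
plt.show()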