We propose a novel framework to remove transient objects from input videos for 3D scene reconstruction using Gaussian Splatting. Our framework consists of the following steps. In the first step, we propose an unsupervised training strategy for a classification network to distinguish between transient objects and static scene parts based on their different training behavior inside the 3D Gaussian Splatting reconstruction. In the second step, we improve the boundary quality and stability of the detected transients by combining our results from the first step with an off-the-shelf segmentation method. We also propose a simple and effective strategy to track objects in the input video forward and backward in time. Our results show an improvement over the current state of the art in existing sparsely captured datasets and significant improvements in a newly proposed densely captured (video) dataset.
This repository implements Transient Mask Predictor (TMP), a solution for handling transient objects in 3D scene reconstruction. For mask refinement functionality (TMR), please refer to our companion repository.
- Automatic Detection of Transient Objects: Integrate transient object removal seamlessly into the 3D reconstruction pipeline.
- Two-Stage Pipeline: Combines TMP and TMR for enhanced mask prediction and refinement.
- Docker Support: Simplifies deployment and setup across different environments.
The installation process aligns with the original Gaussian Splatting project, with additional dependencies specified in environment.yml
. We also provide a Dockerfile
for containerized setups.
By default, the following features are enabled:
- Transient Mask Prediction (TMP)
- Mask Dilation
- Consistency Loss
- Depth Regularization
To start training with default settings:
python train.py -s [path to dataset]
To disable specific features, use the following flags:
- Disable Transient Mask Predictor (TMP):
python train.py -s [path to dataset] --disable_transient
- Disable Mask Dilation:
python train.py -s [path to dataset] --disable_dilate
- Disable Consistency Loss:
python train.py -s [path to dataset] --disable_consistency
- Disable Depth Regularization:
python train.py -s [path to dataset] --lambda_tv 0
python train.py -s [path to dataset] --masks [path to masks] --disable_transient
- Masks should be in
.png
format. - Masks can have any naming format.
- Images and masks are matched based on their positions in the nasorted lists of image filenames and mask filenames.
- It is recommended to slightly dilate your masks to account for potential inaccuracies. Use the
--mask_dilate
flag (default is 5).
To run all experiments without TMR:
bash examples/tmp_benchmark.sh
This script will initiate the training and evaluation processes for the TMP without mask refinement.
To refine transient masks using TMR, follow these steps:
- Prepare TMR input
Run the preparation script:
bash examples/prepare_tmr_input.sh
This script performs the following actions:
-
Reformats Images: Converts images to the format required by SAM2.
-
Extracts Transient Masks and Differences: Retrieves transient masks and difference images from your T-3DGS checkpoint (default iteration is 7000).
- Run TMR
- Follow Instructions: Visit the TMR Repository for detailed instructions.
- Execute Refinement Script: Use the provided script in the TMR repository to perform mask refinement.
- Final Training with Refined Masks After obtaining refined masks from TMR, run the following script to train the model with these masks:
bash examples/tmr_benchmark.sh
If you find this work useful in your research, please consider citing:
@misc{pryadilshchikov2024t3dgsremovingtransientobjects,
title={T-3DGS: Removing Transient Objects for 3D Scene Reconstruction},
author={Vadim Pryadilshchikov and Alexander Markin and Artem Komarichev and Ruslan Rakhimov and Peter Wonka and Evgeny Burnaev},
year={2024},
eprint={2412.00155},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.00155},
}
Our code is based on the official implementation of 3D Gaussian Splatting (3DGS).