Yuzheng Liu* · Siyan Dong* · Shuzhe Wang · Yanchao Yang · Qingnan Fan · Baoquan Chen
SLAM3R is a real-time dense scene reconstruction system that regresses 3D points from video frames using feed-forward neural networks, without explicitly estimating camera parameters.
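In pointmap-style systems like this, the network maps each H×W RGB frame directly to an H×W grid of 3D coordinates expressed in a shared frame, with no intermediate pose or intrinsics estimate. A toy stand-in (not the SLAM3R architecture) just to illustrate the input/output contract:

```python
import numpy as np

def regress_pointmap(frame: np.ndarray) -> np.ndarray:
    """Toy stand-in for a feed-forward pointmap network.

    Input:  (H, W, 3) uint8 RGB frame.
    Output: (H, W, 3) float32 per-pixel 3D points in a shared frame.
    A real model learns this mapping; here we return zeros purely
    to show the shapes involved.
    """
    h, w, _ = frame.shape
    return np.zeros((h, w, 3), dtype=np.float32)

frame = np.zeros((224, 224, 3), dtype=np.uint8)
points = regress_pointmap(frame)
print(points.shape)  # → (224, 224, 3)
```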
- Release pre-trained weights and inference code.
- Release Gradio Demo.
- Release evaluation code.
- Release training code and data.
- Clone SLAM3R:

  ```bash
  git clone https://github.com/PKU-VCL-3DV/SLAM3R.git
  cd SLAM3R
  ```
- Prepare the environment:

  ```bash
  conda create -n slam3r python=3.11 cmake=3.14.0
  conda activate slam3r
  # install torch according to your CUDA version
  pip install torch==2.5.0 torchvision==0.20.0 torchaudio==2.5.0 --index-url https://download.pytorch.org/whl/cu118
  pip install -r requirements.txt
  # optional: install xFormers according to your PyTorch version, see https://github.com/facebookresearch/xformers
  pip install xformers==0.0.28.post2
  ```
- Optional: compile the CUDA kernels for RoPE:

  ```bash
  cd slam3r/pos_embed/curope/
  python setup.py build_ext --inplace
  cd ../../../
  ```
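Since this step is marked optional, a slower pure-PyTorch RoPE path is presumably used when the extension is absent (an assumption; the module path below is inferred from the directory above). A quick way to check which path you get:

```python
# Check whether the compiled cuRoPE extension is importable.
# Module path inferred from slam3r/pos_embed/curope/ above (assumption).
try:
    import slam3r.pos_embed.curope  # noqa: F401
    has_curope = True
except ImportError:
    has_curope = False

print("cuRoPE CUDA kernels available" if has_curope
      else "falling back to the PyTorch RoPE implementation")
```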
- Download the SLAM3R checkpoints for the Image-to-Points model and the Local-to-World model, and place them under `./checkpoints/`.
To run our demo on the Replica dataset, download the sample scene here and unzip it to `./data/Replica/`. Then run the following command to reconstruct the scene from the video frames:

```bash
bash scripts/demo_replica.sh
```

The results will be stored in `./visualization/` by default.
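To sanity-check a result, you can inspect the exported point cloud. A minimal sketch assuming the demo writes ASCII PLY files under `./visualization/` (the actual output format may differ):

```python
def ply_vertex_count(ply_text: str) -> int:
    """Read the vertex count from an ASCII PLY header."""
    for line in ply_text.splitlines():
        if line.startswith("element vertex"):
            return int(line.split()[-1])
        if line.strip() == "end_header":
            break
    raise ValueError("no vertex element found in PLY header")

# Self-contained example on an inline header (stands in for a real file).
sample = (
    "ply\n"
    "format ascii 1.0\n"
    "element vertex 12345\n"
    "property float x\n"
    "end_header\n"
)
print(ply_vertex_count(sample))  # → 12345
```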
We also provide a set of images extracted from an in-the-wild captured video. Download it here and unzip it to `./data/wild/`.

Set the required parameters in the script, then run SLAM3R with the following command:

```bash
bash scripts/demo_wild.sh
```
You can run SLAM3R on your own captured video by following the steps above. Here are some tips:
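If your capture is a video file rather than an image folder, you first need to extract frames. A minimal sketch using ffmpeg; the paths, scene name, and frame rate are placeholders, not values the scripts require:

```shell
# Extract frames from a self-captured video (placeholder paths).
mkdir -p data/wild/my_scene
if command -v ffmpeg >/dev/null 2>&1 && [ -f my_video.mp4 ]; then
  # ~5 frames per second; tune to your camera motion
  ffmpeg -i my_video.mp4 -vf fps=5 data/wild/my_scene/frame_%05d.jpg
fi
```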
If you find our work helpful in your research, please consider citing:
```bibtex
@article{slam3r,
  title={SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos},
  author={Liu, Yuzheng and Dong, Siyan and Wang, Shuzhe and Yang, Yanchao and Fan, Qingnan and Chen, Baoquan},
  journal={arXiv preprint arXiv:2412.09401},
  year={2024}
}
```
Our implementation is based on several awesome repositories:
We thank the respective authors for open-sourcing their code.