ST-DepthNet: A spatio-temporal deep network for depth completion using a single non-repetitive circular scanning Lidar
Supplementary material to our paper in IEEE Robotics and Automation Letters (RA-L).
This project provides deep-learning-based depth completion for sparse measurements captured by the Livox AVIA sensor.
A sparse measurement by the Livox AVIA sensor (left) and the completed depth image by ST-DepthNet (right)
If you find this work helpful for your research, or use any part of the code or the datasets, please cite our paper:
@article{st-depthnet,
author={Zováthi, Örkény and Pálffy, Balázs and Jankó, Zsolt and Benedek, Csaba},
journal={IEEE Robotics and Automation Letters},
title={ST-DepthNet: A Spatio-Temporal Deep Network for Depth Completion Using a Single Non-Repetitive Circular Scanning Lidar},
year={2023},
volume={8},
number={6},
pages={3270-3277},
doi={10.1109/LRA.2023.3266670}
}
We provide training, validation and test data with ground truth information from the Carla simulator. The dataset consists of 11726 randomly sampled sparse input–dense output range image pairs. Each sample is a sequence of five consecutive range images of 400×400 pixels. The dataset is arranged in three folders named:
- Train: 10000 sparse samples with ground truth data,
- Validation: 500 sparse samples with ground truth data,
- Test: 1226 sparse samples with ground truth data.
The LivoxCarla Dataset can be downloaded from this link.
We provide the data both in raw image and in video format. The video format can be used directly by our code for training and inference; simply copy the .avi files into the dataset subfolder of the project.
├── LivoxCarla # Main dataset folder
│ ├── Train # Contains 10000 samples
│ │ ├── train.zip # Samples in raw image format
│ │ ├── train.avi # Samples in video format (directly usable by the code)
│ ├── Validation # Contains 500 samples
│ │ ├── validation.zip # Samples in raw image format
│ │ ├── validation.avi # Samples in video format (directly usable by the code)
│ ├── Test # Contains 1226 samples
│ │ ├── test.zip # Samples in raw image format
│ │ ├── test.avi # Samples in video format (directly usable by the code)
└── ...
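For reference, the video files can be read frame by frame with OpenCV. The sketch below is a minimal example, assuming that the frames are stored sequentially with one 400×400 range image per frame and that five consecutive frames form one input sequence; the exact frame layout (including where the ground truth frames are located) is defined by our training code.

# Minimal sketch: reading range-image frames from the dataset videos with OpenCV.
# Assumed layout (not guaranteed): one 400x400 range image per frame, stored sequentially.
import cv2
import numpy as np

def read_frames(video_path):
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Depth is encoded as grayscale; drop the redundant color channels.
        yield cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cap.release()

frames = list(read_frames("dataset/train.avi"))
# Group five consecutive sparse frames into one input sequence (assumed grouping).
samples = [np.stack(frames[i:i + 5]) for i in range(0, len(frames) - 4, 5)]
print(len(samples), samples[0].shape)  # e.g., N x (5, 400, 400)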
The dataset was generated with the Carla simulator, which can export perfect depth images without any distortion or blurring. To simulate realistic Livox AVIA measurements, the dense depth images were sampled with the rosette scanning pattern of the Livox AVIA sensor. During the whole data recording, the capturing platform (a simulated vehicle) was dynamically moving. To increase the variability of the extractable information (e.g., to vary the ground level), the capturing sensor's position was randomly rotated around the up axis within [−22.5°, 22.5°], and its height was randomly adjusted within [1.5 m, 2.5 m]. The final dataset consists of 11726 randomly sampled input–output data pairs.
- In each sample, the input data is a sparse depth image sequence consisting of five consecutive sparse depth images, each sampled after 200 ms. The implementation of the process is shown in the figure below, where the patterns of the Livox AVIA sensor (displayed in the middle) are used to filter the depth image exported from the simulator, resulting in realistic, Livox-like depth images; a code sketch of this sampling step follows this list.
- The output (ground truth) data was generated at the end of each input sequence using the mask with the full field of view of the Livox AVIA sensor.
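As an illustration of the sampling step above, the sketch below masks a dense simulator depth image with binary Livox scan-pattern images; the file names (dense_depth.png, livox_pattern_*.png) are hypothetical placeholders, not files shipped with the dataset.

# Minimal sketch of the Livox-like sampling described above: a dense Carla depth
# image is filtered with binary scan-pattern masks of the Livox AVIA sensor.
import cv2
import numpy as np

dense = cv2.imread("dense_depth.png", cv2.IMREAD_GRAYSCALE)  # perfect depth from Carla

sparse_sequence = []
for t in range(5):  # five consecutive 200 ms scan windows
    pattern = cv2.imread(f"livox_pattern_{t}.png", cv2.IMREAD_GRAYSCALE)
    mask = pattern > 0  # pixels covered by the rosette pattern in this window
    sparse_sequence.append(np.where(mask, dense, 0))  # keep depth only where scanned

sparse_input = np.stack(sparse_sequence)  # shape (5, H, W): the network input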
A training sample showing the five consecutive sparse depth images and the corresponding ground truth is displayed in the following video:
ExampleOfDataVideo.mp4
We also provide three real-life measurement sequences in video format, along with code for testing the trained models in real-world scenarios. The LivoxBudapest Dataset can be downloaded from this link.
The depth map prediction results by the considered reference techniques and the ST-DepthNet method can be viewed for the complete sequences in the enclosed YouTube video streams, which also contain an RGB image channel for visual verification.
Methods displayed in the videos:
- Narrow integration time (200 ms): measurements of the Livox AVIA sensor accumulated over 200 ms windows.
- Large integration time (1000 ms): measurements of the Livox AVIA sensor accumulated over 1000 ms windows.
- IP-Basic++: An improved version of the IP-Basic reference depth completion method.
- Sparse-to-Dense: A GAN-based reference depth completion method.
- Proposed ST-DepthNet: Depth estimations by the proposed model.
Each video also includes an RGB camera recording, purely for better visualization; it may be offset from the depth streams by a few frames.
DepthImageCompletionOnNarrowStreet.mp4
The project was implemented on Ubuntu 18.04 with CUDA 10.2 (and a compatible cuDNN) using an 8GB GeForce 1080 Ti GPU. All code was written in Python 3.7.0 with the tensorflow-gpu 1.13.1 and keras-gpu 2.3.1 packages in a conda virtual environment.
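For reference, a matching environment can be set up with conda roughly as follows (the environment name is arbitrary, and exact package availability depends on your conda channels):
conda create -n st-depthnet python=3.7.0
conda activate st-depthnet
conda install tensorflow-gpu=1.13.1 keras-gpu=2.3.1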
To train the proposed ST-DepthNet model on our LivoxCarla training dataset, simply run:
python train.py --n_epochs=10 --batch_size=1 --image_size=400 --save_model=./models/model
A pretrained model of the proposed ST-DepthNet can be downloaded from the following link.
There are three ways to plot the model's predictions using the visualize.py file.
- For plotting input, ground truth and prediction in one row, run:
python visualize.py --plot_type=0
- For plotting and saving only the model predictions, run:
python visualize.py --plot_type=1
- For plotting the predictions of two model variants for comparison, run:
python visualize.py --plot_type=2
Additional scripts are provided in the utils folder to ease the understanding and usage of this repository. Using them may require installing further libraries such as OpenCV.
- To project a grayscale depth image into a 3D point cloud, run the Depth_to_cloud.py file (see the sketch after this list):
python Depth_to_cloud.py ./path/depth_image.png
- For constructing the required video formats for your custom data, you can use the Video_maker.py file.
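For orientation, the core back-projection step of such a script can be sketched as follows; the pinhole camera model and the 70.4° field-of-view value are illustrative assumptions and may differ from the actual implementation in Depth_to_cloud.py.

# Minimal sketch: back-projecting a grayscale depth image into a 3D point cloud.
# The pinhole model and the field of view below are illustrative assumptions.
import sys
import cv2
import numpy as np

depth = cv2.imread(sys.argv[1], cv2.IMREAD_GRAYSCALE).astype(np.float32)
h, w = depth.shape
fov = np.deg2rad(70.4)             # assumed field of view of the sensor
f = (w / 2.0) / np.tan(fov / 2.0)  # focal length in pixels
cx, cy = w / 2.0, h / 2.0          # principal point at the image center

u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth                          # gray value used directly as depth
x = (u - cx) * z / f
y = (v - cy) * z / f

points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
points = points[points[:, 2] > 0]  # drop pixels without a measurement
np.savetxt("cloud.xyz", points, fmt="%.3f")  # simple xyz text export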
This repository was implemented in the Machine Perception Research Laboratory, Institute of Computer Science and Control (SZTAKI), Budapest.