ST-DepthNet: A spatio-temporal deep network for depth completion using a single non-repetitive circular scanning Lidar
Supplementary material to our paper in IEEE Robotics and Automation Letters (RA-L).
This project provides deep-learning-based depth completion for sparse measurements captured by the Livox AVIA sensor.
A sparse measurement by the Livox AVIA sensor (left) and the completed depth image by ST-DepthNet (right)
If you find this work helpful for your research, or use any part of the code or the datasets, please cite our paper:
@article{st-depthnet,
author={Zováthi, Örkény and Pálffy, Balázs and Jankó, Zsolt and Benedek, Csaba},
journal={IEEE Robotics and Automation Letters},
title={ST-DepthNet: A Spatio-Temporal Deep Network for Depth Completion Using a Single Non-Repetitive Circular Scanning Lidar},
year={2023},
volume={8},
number={6},
pages={3270-3277},
doi={10.1109/LRA.2023.3266670}
}
We provide training, validation and test data with ground truth information from the Carla simulator. The dataset consists of 11726 randomly sampled sparse input–dense output range image pairs. Each sample is a sequence of five consecutive range images of 400×400 pixels. The dataset is arranged in three folders named:
- Train: 10000 sparse samples with ground truth data,
- Validation: 500 sparse samples with ground truth data,
- Test: 1226 sparse samples with ground truth data.
The LivoxCarla Dataset can be downloaded from this link.
We provide the data both in raw image and in video format. The video format can be used directly by our code for training and inference; simply copy the .avi files into the dataset subfolder of the project.
├── LivoxCarla # Main dataset folder
│ ├── Train # Contains 10000 samples
│ │ ├── train.zip # Samples in raw image format
│ │ ├── train.avi # Samples in video format (directly usable by the code)
│ ├── Validation # Contains 500 samples
│ │ ├── validation.zip # Samples in raw image format
│ │ ├── validation.avi # Samples in video format (directly usable by the code)
│ ├── Test # Contains 1226 samples
│ │ ├── test.zip # Samples in raw image format
│ │ ├── test.avi # Samples in video format (directly usable by the code)
└── ...
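For reference, the video files can be read frame by frame with OpenCV. The sketch below is a minimal example, assuming that the frames are stored sequentially with one 400×400 range image per frame and that five consecutive frames form one input sequence; the exact frame layout (including where the ground truth frames are located) is defined by our training code.

# Minimal sketch: reading range-image frames from the dataset videos with OpenCV.
# Assumed layout (not guaranteed): one 400x400 range image per frame, stored sequentially.
import cv2
import numpy as np

def read_frames(video_path):
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Depth is encoded as grayscale; drop the redundant color channels.
        yield cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cap.release()

frames = list(read_frames("dataset/train.avi"))
# Group five consecutive sparse frames into one input sequence (assumed grouping).
samples = [np.stack(frames[i:i + 5]) for i in range(0, len(frames) - 4, 5)]
print(len(samples), samples[0].shape)  # e.g., N x (5, 400, 400)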
The dataset was generated with the Carla simulator, which can export perfect depth images without any distortion or blurring. To simulate realistic Livox AVIA measurements, the dense depth images were sampled with the rosette scanning pattern of the Livox AVIA sensor. During the whole data recording, the capturing platform (a simulated vehicle) was dynamically moving. To increase the variability of the extractable information (e.g., to vary the ground level), the capturing sensor's position was randomly rotated around the up axis within [−22.5°, 22.5°], and its height was randomly adjusted within [1.5 m, 2.5 m]. The final dataset consists of 11726 randomly sampled input–output data pairs.
- In each sample, the input data is a sparse depth image sequence consisting of five consecutive sparse depth images, each sampled after 200 ms. The implementation of the process is shown in the figure below, where the patterns of the Livox AVIA sensor (displayed in the middle) are used to filter the depth image exported from the simulator, resulting in realistic, Livox-like depth images; a code sketch of this sampling step follows this list.
- The output (ground truth) data was generated at the end of each input sequence using the mask with the full field of view of the Livox AVIA sensor.
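As an illustration of the sampling step above, the sketch below masks a dense simulator depth image with binary Livox scan-pattern images; the file names (dense_depth.png, livox_pattern_*.png) are hypothetical placeholders, not files shipped with the dataset.

# Minimal sketch of the Livox-like sampling described above: a dense Carla depth
# image is filtered with binary scan-pattern masks of the Livox AVIA sensor.
import cv2
import numpy as np

dense = cv2.imread("dense_depth.png", cv2.IMREAD_GRAYSCALE)  # perfect depth from Carla

sparse_sequence = []
for t in range(5):  # five consecutive 200 ms scan windows
    pattern = cv2.imread(f"livox_pattern_{t}.png", cv2.IMREAD_GRAYSCALE)
    mask = pattern > 0  # pixels covered by the rosette pattern in this window
    sparse_sequence.append(np.where(mask, dense, 0))  # keep depth only where scanned

sparse_input = np.stack(sparse_sequence)  # shape (5, H, W): the network input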
A training sample showing the five consecutive sparse depth images and the corresponding ground truth is displayed in the following video:
ExampleOfDataVideo.mp4
We also provide three real-life measurement sequences in video format, along with code for testing the trained models in real-world scenarios. The LivoxBudapest Dataset can be downloaded from this link.
The depth map prediction results by the considered reference techniques and the ST-DepthNet method can be viewed for the complete sequences in the enclosed YouTube video streams, which also contain an RGB image channel for visual verification.
Methods displayed in the videos:
- Narrow integration time (200 ms): measurements of the Livox AVIA sensor accumulated over 200 ms windows.
- Large integration time (1000 ms): measurements of the Livox AVIA sensor accumulated over 1000 ms windows.
- IP-Basic++: An improved version of the IP-Basic reference depth completion method.
- Sparse-to-Dense: A GAN-based reference depth completion method.
- Proposed ST-DepthNet: Depth estimations by the proposed model.
Each video also includes an RGB camera recording, purely for better visualization; it may be offset from the depth streams by a few frames.
DepthImageCompletionOnNarrowStreet.mp4
The project was implemented on Ubuntu 18.04 with CUDA 10.2 (and a compatible cuDNN) using an 8GB GeForce 1080 Ti GPU. All code was written in Python 3.7.0 with the tensorflow-gpu 1.13.1 and keras-gpu 2.3.1 packages in a conda virtual environment.
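For reference, a matching environment can be set up with conda roughly as follows (the environment name is arbitrary, and exact package availability depends on your conda channels):
conda create -n st-depthnet python=3.7.0
conda activate st-depthnet
conda install tensorflow-gpu=1.13.1 keras-gpu=2.3.1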
To train the proposed ST-DepthNet model on our LivoxCarla training dataset, simply run:
python train.py --n_epochs=10 --batch_size=1 --image_size=400 --save_model=./models/model
A pretrained model of the proposed ST-DepthNet can be downloaded from the following link.
There are three ways to plot the model's predictions using the visualize.py file.
- For plotting input, ground truth and prediction in one row, run:
python visualize.py --plot_type=0
- For plotting and saving only the model predictions, run:
python visualize.py --plot_type=1
- For plotting the predictions of two model variants for comparison, run:
python visualize.py --plot_type=2
Additional scripts are provided in the utils folder to ease the understanding and usage of this repository. Using them may require installing further libraries such as OpenCV.
- To project a grayscale depth image into a 3D point cloud, run the Depth_to_cloud.py file (see the sketch after this list):
python Depth_to_cloud.py ./path/depth_image.png
- For constructing the required video formats for your custom data, you can use the Video_maker.py file.
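For orientation, the core back-projection step of such a script can be sketched as follows; the pinhole camera model and the 70.4° field-of-view value are illustrative assumptions and may differ from the actual implementation in Depth_to_cloud.py.

# Minimal sketch: back-projecting a grayscale depth image into a 3D point cloud.
# The pinhole model and the field of view below are illustrative assumptions.
import sys
import cv2
import numpy as np

depth = cv2.imread(sys.argv[1], cv2.IMREAD_GRAYSCALE).astype(np.float32)
h, w = depth.shape
fov = np.deg2rad(70.4)             # assumed field of view of the sensor
f = (w / 2.0) / np.tan(fov / 2.0)  # focal length in pixels
cx, cy = w / 2.0, h / 2.0          # principal point at the image center

u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth                          # gray value used directly as depth
x = (u - cx) * z / f
y = (v - cy) * z / f

points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
points = points[points[:, 2] > 0]  # drop pixels without a measurement
np.savetxt("cloud.xyz", points, fmt="%.3f")  # simple xyz text export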
This repository was implemented in the Machine Perception Research Laboratory, Institute of Computer Science and Control (SZTAKI), Budapest.