This is the official implementation of:
Fully Sparse 3D Object Detection and Embracing Single Stride 3D Object Detector with Sparse Transformer.
🔥 FSD Preview Release
- The code of FSD on Waymo is released. See `./configs/fsd/fsd_waymoD1_1x.py`.
- We provide tools for processing the Argoverse 2 dataset in `./tools/argo`. We will release the instructions and configs of the Argo2 model later.
- A very fast Waymo evaluation is available; see the Usage section for detailed instructions. The whole evaluation process of FSD on Waymo takes less than 10 minutes with 8 2080Ti GPUs.
- We cannot distribute model weights of FSD on Waymo due to the license. Users could contact us for the private model weights.
- Before using this repo, please install TorchEx, SpConv2 (SpConv 1.x is not supported) and torch_scatter.
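
A minimal sketch for confirming that the dependencies listed above can be found before launching training. The import names `spconv`, `torch_scatter`, and `torchex` are assumptions about how these packages are exposed; adjust them if your installation differs. The check only tests importability and does not distinguish SpConv 2.x from 1.x.

```python
import importlib.util

# Assumed import names for the required packages (not confirmed by this README).
REQUIRED_MODULES = ["torch", "spconv", "torch_scatter", "torchex"]


def missing_modules(names):
    """Return the subset of `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```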
NEWS
- [22-09-19] The code of FSD is released here.
- [22-09-15] 🔥 FSD is accepted at NeurIPS 2022.
- [22-06-06] Support SST with CenterHead, cosine similarity in attention, faster SSTInputLayer. See Usage for details.
- [22-03-02] 🔥 SST is accepted at CVPR 2022.
- Support Weighted NMS (CPU version) in RangeDet, improving performance of the vehicle class by ~1 AP. See the Usage section.
- We refactored the code to provide clearer function prototypes and make it easier to understand. See `./configs/sst_refactor`.
- Supported voxel-based region partition in `./configs/sst_refactor`. Users can easily use voxel-based SST by modifying the `recover_bev` function in the backbone.
- Waymo Leaderboard results updated in SST_v1.
PyTorch >= 1.9 is recommended for better support of the checkpoint technique.
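
The version recommendation can be checked programmatically. The helper below is a hypothetical sketch (`meets_min_version` is not part of this repo); it compares numeric components rather than raw strings, so that `1.10` is correctly treated as newer than `1.9`.

```python
import re


def meets_min_version(version, minimum=(1, 9)):
    """Check whether a version string such as '1.10.2+cu113' is >= `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    # Compare (major, minor) as integers, not as text.
    return (int(match.group(1)), int(match.group(2))) >= minimum
```

With PyTorch installed, it could be called as `meets_min_version(torch.__version__)`.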
Our implementation is based on MMDetection3D, so just follow their getting_started guide and simply run the script `run.sh`.
ATTENTION: If you generated data with the official MMDetection3D, we highly recommend checking the data version, because MMDetection3D refactored its coordinate definition after v1.0. A hotfix is to use our code to regenerate `waymo_dbinfo_train.pkl`.
- Copy `tools/idx2timestamp.pkl` and `tools/idx2contextname.pkl` to `./data/waymo/kitti_format/`.
- Pass the argument `--eval fast` (see `run.sh`). This argument directly converts network outputs to the Waymo `.bin` format, which is much faster than the old way.
- Users could further build the multi-thread Waymo evaluation tool (link) for faster evaluation.
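
The copy step in the list above can be scripted. This is a minimal sketch that assumes it is run from the repository root; `copy_index_files` is a hypothetical helper, not a tool shipped with this repo, and it creates the destination directory if it does not exist yet.

```python
import shutil
from pathlib import Path

# File names taken from the copy step above.
INDEX_FILES = ("idx2timestamp.pkl", "idx2contextname.pkl")


def copy_index_files(repo_root=".", dest="./data/waymo/kitti_format"):
    """Copy the two index files from <repo_root>/tools into `dest`."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in INDEX_FILES:
        src = Path(repo_root) / "tools" / name
        # shutil.copy raises FileNotFoundError if a source file is missing.
        copied.append(Path(shutil.copy(src, dest_dir / name)))
    return copied
```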
FSD requires segmentation first, so we use an `EnableFSDDetectionHookIter` to enable the detection part after a segmentation warmup.
If the warmup parameter is not set properly (which is likely on a customized dataset), the memory cost may be large and the training time unstable. This is caused by the CPU implementation of CCL (connected component labeling), which we will replace with a GPU version later.
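
The warmup idea described above can be illustrated with a toy iteration-based switch. This is only a sketch of the concept: `DetectionGate`, `before_iter`, and `enable_after_iter` are hypothetical names chosen for illustration, and the code is not the implementation of `EnableFSDDetectionHookIter`.

```python
class DetectionGate:
    """Toy iteration-based switch: off during warmup, on afterwards."""

    def __init__(self, enable_after_iter):
        if enable_after_iter < 0:
            raise ValueError("enable_after_iter must be non-negative")
        self.enable_after_iter = enable_after_iter
        self.detection_enabled = False

    def before_iter(self, cur_iter):
        """Update the flag for iteration `cur_iter` and return it."""
        if cur_iter >= self.enable_after_iter:
            self.detection_enabled = True
        return self.detection_enabled


# Segmentation-only for iterations 0..2, detection enabled from iteration 3.
gate = DetectionGate(enable_after_iter=3)
schedule = [gate.before_iter(i) for i in range(5)]
print(schedule)  # [False, False, False, True, True]
```

A warmup that is too short would switch the detection part on while the segmentation output is still poor, which is the situation the paragraph above warns about.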
If users do not want to spend time tuning the `EnableFSDDetectionHookIter`, they can first use our fast pretrain config (e.g., `fsd_sst_encoder_pretrain`) for a once-for-all warmup. The script `tools/model_converters/fsd_pretrain_converter.py` converts the pretrain checkpoint so that it can be loaded for FSD training (with `load_from='xx'` in the config). With the once-for-all pretrain, users can adopt a much shorter `EnableFSDDetectionHookIter`.
SST-based FSD converges more slowly than SpConv-based FSD, so we recommend that users adopt the fast pretrain for SST-based FSD.
We only provide the single-stage model here. For our two-stage models, please follow LiDAR-RCNN. It is also a good choice to apply other powerful second-stage detectors to our single-stage SST.
We borrow Weighted NMS from RangeDet and observe ~1 AP improvement on our best Vehicle model. To use it, clone RangeDet and run `pip install -v -e .` in its root directory. Then refer to `config/sst/sst_waymoD5_1x_car_8heads_wnms.py` to modify your config and enable Weighted NMS. Note that we only implement the CPU version for now, so it is relatively slow. Do NOT use it on 3-class models, as it leads to a performance drop.
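
For readers unfamiliar with the technique, the sketch below shows the general idea of weighted NMS: instead of discarding the boxes that overlap a top-scoring box, it merges them into a score-weighted average. This is a simplified illustration on axis-aligned 2D boxes `(x1, y1, x2, y2)` with positive scores, not RangeDet's implementation, which works on rotated 3D boxes and may differ in detail. Keeping the top score as the merged score is an assumption made here.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def weighted_nms(boxes, scores, iou_thr=0.5):
    """Return (merged_boxes, merged_scores) in descending score order."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = set()
    out_boxes, out_scores = [], []
    for i in order:
        if i in suppressed:
            continue
        # Cluster: the current top box plus every remaining box overlapping it.
        cluster = [i] + [
            j for j in order
            if j != i and j not in suppressed and iou(boxes[i], boxes[j]) >= iou_thr
        ]
        suppressed.update(cluster)
        total = sum(scores[j] for j in cluster)
        # Score-weighted average of each coordinate over the cluster.
        merged = tuple(
            sum(scores[j] * boxes[j][k] for j in cluster) / total for k in range(4)
        )
        out_boxes.append(merged)
        out_scores.append(scores[i])  # assumption: keep the top score
    return out_boxes, out_scores


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.1, 0.8]
merged_boxes, merged_scores = weighted_nms(boxes, scores)
print(merged_boxes, merged_scores)
```

In this example the first two boxes overlap heavily and are merged into one box pulled slightly toward the lower-scoring one, while the third box is kept unchanged.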
A basic config of SST with CenterHead: `./configs/sst_refactor/sst_waymoD5_1x_3class_centerhead.py`, which brings a significant improvement on the Vehicle class.
To enable the faster SSTInputLayer, clone https://github.com/Abyssaledge/TorchEx, and run `pip install -v .`.
Validation: please refer to this page. Test: please refer to this submission
Model | #Sweeps | Veh_L1 | Ped_L1 | Cyc_L1 | Veh_L2 | Ped_L2 | Cyc_L2 |
---|---|---|---|---|---|---|---|
SST_TS_3f | 3 | 80.99 | 83.30 | 75.69 | 73.08 | 76.93 | 73.22 |
Please visit the website for detailed results: SST_v1
One-stage models on the Waymo validation split (refer to this page for the detailed performance of CenterHead SST)
Model | #Sweeps | Veh_L1 | Ped_L1 | Cyc_L1 | Veh_L2 | Ped_L2 | Cyc_L2 |
---|---|---|---|---|---|---|---|
SST_1f | 1 | 73.57 | 80.01 | 70.72 | 64.80 | 71.66 | 68.01 |
SST_1f_center (4 SST blocks) | 1 | 75.40 | 80.28 | 71.58 | 66.76 | 72.63 | 68.89 |
SST_3f | 3 | 75.16 | 83.24 | 75.96 | 66.52 | 76.17 | 73.59 |
Note that we train the 3 classes together, so the performance above is a little bit lower than that reported in our paper.
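
The per-metric gap between the single-frame rows of the validation table can be read off with a few lines of Python. The numbers are copied from the SST_1f and SST_1f_center rows; `metric_gains` is a hypothetical helper for illustration.

```python
METRICS = ["Veh_L1", "Ped_L1", "Cyc_L1", "Veh_L2", "Ped_L2", "Cyc_L2"]
SST_1F = [73.57, 80.01, 70.72, 64.80, 71.66, 68.01]
SST_1F_CENTER = [75.40, 80.28, 71.58, 66.76, 72.63, 68.89]


def metric_gains(baseline, improved):
    """Difference (improved - baseline) per metric, rounded to 2 decimals."""
    return {
        name: round(new - old, 2)
        for name, old, new in zip(METRICS, baseline, improved)
    }


gains = metric_gains(SST_1F, SST_1F_CENTER)
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```

The largest differences are on the Vehicle class (+1.83 on Veh_L1 and +1.96 on Veh_L2).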
Please consider citing our work as follows if it is helpful.
@inproceedings{fan2022embracing,
title={{Embracing Single Stride 3D Object Detector with Sparse Transformer}},
author={Fan, Lue and Pang, Ziqi and Zhang, Tianyuan and Wang, Yu-Xiong and Zhao, Hang and Wang, Feng and Wang, Naiyan and Zhang, Zhaoxiang},
booktitle={CVPR},
year={2022}
}
@article{fan2022fully,
title={{Fully Sparse 3D Object Detection}},
author={Fan, Lue and Wang, Feng and Wang, Naiyan and Zhang, Zhaoxiang},
journal={arXiv preprint arXiv:2207.10035},
year={2022}
}
This project is based on the following codebases.
We thank the authors of CenterPoint for providing their detailed results.