Huaiyuan Xu · Junliang Chen · Shiyu Meng · Yi Wang · Lap-Pui Chau*
This work focuses on 3D dense perception in autonomous driving, encompassing LiDAR-Centric Occupancy Perception, Vision-Centric Occupancy Perception, and Multi-Modal Occupancy Perception, and discusses the information fusion techniques used in this field. We believe this is the most comprehensive survey to date on 3D Occupancy Perception. Please stay tuned! 😉😉😉 Continually updated.
✨ You are welcome to share your work on any topic related to 3D occupancy for autonomous driving (not only perception, but also applications)!
If you discover any missing work or have any suggestions, please feel free to submit a pull request or contact us. We will promptly add the missing papers to this repository.
[1] A systematic survey of the latest research on 3D occupancy perception in the field of autonomous driving.
[2] The survey provides a taxonomy of 3D occupancy perception and elaborates on core methodological issues, including network pipelines, multi-source information fusion, and effective network training.
[3] The survey presents evaluations of 3D occupancy perception and offers detailed performance comparisons. Furthermore, current limitations and future research directions are discussed.
3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems and is attracting significant attention from both industry and academia. Similar to traditional bird's-eye view (BEV) perception, 3D occupancy perception is multi-source in nature and requires information fusion. The difference is that it captures the vertical structures that 2D BEV ignores. In this survey, we review the most recent works on 3D occupancy perception and provide in-depth analyses of methodologies with various input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. Furthermore, challenges and future research directions are discussed. We hope this report will inspire the community and encourage more research on 3D occupancy perception.
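To make the "vertical structures" point concrete, below is a minimal illustrative sketch (not code from the survey or from any listed method): it represents a scene as a semantic voxel occupancy grid, collapses it into a 2D BEV map, and computes the voxel-wise mIoU metric commonly reported by the occupancy benchmarks listed later. The grid shape, the `FREE` label, and the class IDs are assumptions made purely for illustration.

```python
# Minimal sketch: semantic occupancy grid vs. 2D BEV, plus voxel-wise mIoU.
# Shapes, class IDs, and the FREE label are illustrative assumptions.
import numpy as np

FREE = 0  # hypothetical label for empty voxels


def bev_from_occupancy(occ_grid: np.ndarray) -> np.ndarray:
    """Collapse a semantic occupancy grid of shape (X, Y, Z) into a BEV map
    of shape (X, Y) by keeping the highest occupied voxel in each pillar."""
    X, Y, _ = occ_grid.shape
    bev = np.full((X, Y), FREE, dtype=occ_grid.dtype)
    for z in range(occ_grid.shape[2]):  # higher z overwrites lower z
        layer = occ_grid[:, :, z]
        mask = layer != FREE
        bev[mask] = layer[mask]
    return bev


def voxel_miou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Voxel-wise mean IoU over semantic classes (FREE excluded), the metric
    commonly used to evaluate semantic occupancy prediction."""
    ious = []
    for c in range(1, num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0


# Toy example: a 4 x 4 x 8 grid with a "car" (class 1) beneath overhanging
# "vegetation" (class 2) in the same pillar.
occ = np.zeros((4, 4, 8), dtype=np.int64)
occ[1, 1, 0:2] = 1   # car occupies the two lowest voxels of the pillar
occ[1, 1, 5] = 2     # vegetation overhangs the same pillar
print(bev_from_occupancy(occ)[1, 1])        # -> 2 (the car is hidden in BEV)
print(voxel_miou(occ, occ, num_classes=3))  # -> 1.0 (perfect prediction)
```

In this toy example, the car voxels under the overhanging vegetation are present in the 3D grid but disappear in the BEV map, which is exactly the vertical information that 3D occupancy perception preserves.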
- Introduction
- Summary of Contents
- Methods: A Survey
- 3D Occupancy Datasets
- Occupancy-based Applications
- Cite The Survey
- Contact
Year | Venue | Paper Title | Link |
---|---|---|---|
2024 | AAAI | Semantic Complete Scene Forecasting from a 4D Dynamic Point Cloud Sequence | Project Page |
2023 | T-IV | Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders | Code |
2023 | arXiv | PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction | Code |
2023 | arXiv | LiDAR-based 4D Occupancy Completion and Forecasting | Project Page |
2021 | T-PAMI | Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data | - |
2021 | AAAI | Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion | Code |
2020 | CoRL | S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds | - |
2020 | 3DV | LMSCNet: Lightweight Multiscale 3D Semantic Completion | Code |
Year | Venue | Paper Title | Link |
---|---|---|---|
2024 | IJCAI | Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion | Code |
2024 | ICRA | RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision | Code |
2024 | ICRA | MonoOcc: Digging into Monocular Semantic Occupancy Prediction | Code |
2024 | ICRA | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird’s-Eye View and Perspective View | - |
2024 | CVPR | Symphonize 3D Semantic Scene Completion with Contextual Instance Queries | Code |
2024 | CVPR | SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction | Project Page |
2024 | CVPR | SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction | Project Page |
2024 | CVPR | PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Code |
2024 | CVPR | Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation | Code |
2024 | CVPR | COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction | Code |
2024 | CVPR | Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles | Project Page |
2024 | CVPR | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | Code |
2024 | CVPR | Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation | Project Page |
2024 | CVPR | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | - |
2024 | AAAI | Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving | Code |
2024 | AAAI | One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception | - |
2024 | RA-L | Multi-Camera Unified Pre-Training via 3D Scene Reconstruction | Code |
2024 | arXiv | OccFlowNet: Towards Self-supervised Occupancy Estimation via Differentiable Rendering and Occupancy Flow | - |
2024 | arXiv | OccFiner: Offboard Occupancy Refinement with Hybrid Propagation | - |
2024 | arXiv | InverseMatrixVT3D: An Efficient Projection Matrix-Based Approach for 3D Occupancy Prediction | Code |
2024 | arXiv | Unified Spatio-Temporal Tri-Perspective View Representation for 3D Semantic Occupancy Prediction | Project Page |
2024 | arXiv | ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | - |
2023 | T-IV | 3DOPFormer: 3D Occupancy Perception from Multi-Camera Images with Directional and Distance Enhancement | Code |
2023 | NeurIPS | POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images | Project Page |
2023 | NeurIPS | Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving | Project Page |
2023 | ICCV | SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | Project Page |
2023 | ICCV | Scene as Occupancy | Code |
2023 | ICCV | OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction | Code |
2023 | ICCV | NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space | Code |
2023 | CVPR | VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Code |
2023 | CVPR | Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction | Project Page |
2023 | arXiv | SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street Views | Code |
2023 | arXiv | SOccDPT: Semi-Supervised 3D Semantic Occupancy from Dense Prediction Transformers trained under memory constraints | - |
2023 | arXiv | OVO: Open-Vocabulary Occupancy | Code |
2023 | arXiv | OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries | Code |
2023 | arXiv | OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving | Project Page |
2023 | arXiv | OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields | Project Page |
2023 | arXiv | OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion | Code |
2023 | arXiv | Fully Sparse 3D Occupancy Prediction | Code |
2023 | arXiv | FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin | Code |
2023 | arXiv | FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation | Code |
2023 | arXiv | DepthSSC: Depth-Spatial Alignment and Dynamic Voxel Resolution for Monocular 3D Semantic Scene Completion | - |
2023 | arXiv | Camera-based 3D Semantic Scene Completion with Sparse Guidance Network | Code |
2023 | arXiv | A Simple Framework for 3D Occupancy Estimation in Autonomous Driving | Code |
2023 | arXiv | UniWorld: Autonomous Driving Pre-training via World Models | Code |
2022 | CVPR | MonoScene: Monocular 3D Semantic Scene Completion | Project Page |
Year | Venue | Paper Title | Link |
---|---|---|---|
2024 | arXiv | Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolution | - |
2024 | arXiv | OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving | Project Page |
2024 | arXiv | OccFusion: A Straightforward and Effective Multi-Sensor Fusion Framework for 3D Occupancy Prediction | - |
2024 | arXiv | Co-Occ: Coupling Explicit Feature Fusion with Volume Rendering Regularization for Multi-Modal 3D Semantic Occupancy Prediction | Project Page |
2024 | arXiv | Unleashing HyDRa: Hybrid Fusion, Depth Consistency and Radar for Unified 3D Perception | - |
2023 | ICCV | OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception | Code |
Dataset | Year | Venue | Modality | # of Classes | Flow | Link |
---|---|---|---|---|---|---|
OpenScene | 2024 | CVPR 2024 Challenge | Camera | - | ✔️ | Intro. |
Cam4DOcc | 2024 | CVPR | Camera+LiDAR | 2 | ✔️ | Intro. |
Occ3D | 2023 | NeurIPS | Camera | 14 (Occ3D-Waymo), 16 (Occ3D-nuScenes) | ❌ | Intro. |
OpenOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
OpenOccupancy | 2023 | ICCV | Camera+LiDAR | 16 | ❌ | Intro. |
SurroundOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
OCFBench | 2023 | arXiv | LiDAR | - (OCFBench-Lyft), 17 (OCFBench-Argoverse), 25 (OCFBench-ApolloScape), 16 (OCFBench-nuScenes) | ❌ | Intro. |
SSCBench | 2023 | arXiv | Camera | 19 (SSCBench-KITTI-360), 16 (SSCBench-nuScenes), 14 (SSCBench-Waymo) | ❌ | Intro. |
SemanticKITTI | 2019 | ICCV | Camera+LiDAR | 19 (Semantic Scene Completion task) | ❌ | Intro. |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
BEV Segmentation | 2024 | arXiv | OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks | - |
3D Panoptic Segmentation | 2024 | CVPR | PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Code |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
3D Flow Prediction | 2024 | CVPR | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | Code |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
3D Object Detection | 2024 | CVPR | Learning Occupancy for Monocular 3D Object Detection | Code |
3D Object Detection | 2023 | arXiv | SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection | Code |
Specific Task | Year | Venue | Paper Title | Link |
---|---|---|---|---|
Scene Generation | 2024 | CVPR | SemCity: Semantic Scene Generation with Triplane Diffusion | Code |
Specific Tasks | Year | Venue | Paper Title | Link |
---|---|---|---|---|
Occupancy Prediction, 3D Object Detection, Online Mapping, Multi-object Tracking, Motion Prediction, Motion Planning | 2024 | CVPR | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | - |
Occupancy Prediction, 3D Object Detection, BEV segmentation, Motion Planning | 2023 | ICCV | Scene as Occupancy | Code |
If you find our survey and repository useful for your research project, please consider citing our paper:
@misc{xu2024survey,
      title={A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective},
      author={Huaiyuan Xu and Junliang Chen and Shiyu Meng and Yi Wang and Lap-Pui Chau},
      year={2024},
      eprint={2405.05173},
      archivePrefix={arXiv}
}