This is the official implementation of "V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction" by Zewei Zhou, Hao Xiang, Zhaoliang Zheng, Seth Z. Zhao, Mingyue Lei, Yun Zhang, Tianhui Cai, Xinyi Liu, Johnson Liu, Maheswari Bajji, Jacob Pham, Xin Xia, Zhiyu Huang, Bolei Zhou, Jiaqi Ma.
V2XPnP is the first open-source V2X spatio-temporal fusion framework for cooperative perception and prediction. The framework combines an intermediate fusion strategy with one-step communication and integrates diverse attention fusion modules into a unified Transformer architecture for V2X spatio-temporal information. Our benchmark model zoo includes 11 SOTA models across no fusion, early fusion, late fusion, and intermediate fusion.
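As a rough illustration of what spatio-temporal attention fusion means in this setting, the sketch below shows a generic Transformer-style block that first applies temporal self-attention over each agent's feature history and then cross-agent spatial attention on features shared in one communication step. It is a conceptual example only: the module names, tensor layout, and ordering are assumptions for illustration, not the actual V2XPnP implementation.

```python
# Conceptual sketch (NOT the actual V2XPnP module): temporal self-attention over
# each agent's BEV feature history, followed by spatial cross-agent attention.
# Tensor shapes and module names are illustrative assumptions only.
import torch
import torch.nn as nn

class SpatioTemporalFusionBlock(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        self.norm3 = nn.LayerNorm(dim)

    def forward(self, feats):
        # feats: (agents, frames, tokens, dim) -- per-agent BEV feature tokens over time
        a, t, n, d = feats.shape

        # Temporal fusion: each spatial token attends to its own history.
        x = feats.permute(0, 2, 1, 3).reshape(a * n, t, d)
        x = self.norm1(x + self.temporal_attn(x, x, x)[0])
        x = x.reshape(a, n, t, d).permute(0, 2, 1, 3)            # back to (a, t, n, d)

        # Spatial fusion: at each frame, the ego tokens attend to all agents' tokens.
        ego = x[0]                                               # (t, n, d), ego agent
        others = x.permute(1, 0, 2, 3).reshape(t, a * n, d)      # all agents per frame
        ego = self.norm2(ego + self.spatial_attn(ego, others, others)[0])

        return self.norm3(ego + self.ffn(ego))                   # fused ego features (t, n, d)
```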
V2XPnP Sequential Dataset is the first large-scale, real-world V2X sequential dataset featuring multiple agents and all V2X collaboration modes, i.e., vehicle-to-vehicle (V2V), infrastructure-to-infrastructure (I2I), vehicle-centric (VC), and infrastructure-centric (IC).
Supported by the UCLA Mobility Lab
- Supports both simulation and real-world V2X datasets
- Multiple tasks supported
  - Cooperative perception and prediction
  - Cooperative single-frame perception
  - Cooperative temporal perception
  - Cooperative prediction
- SOTA models supported
  - No Fusion (Decoupled)
  - FaF [CVPR 2018] (No Fusion, End-to-end)
  - Early Fusion
  - Late Fusion (Decoupled)
  - F-Cooper [SEC 2019]
  - V2VNet [ECCV 2020]
  - DiscoNet [NeurIPS 2021]
  - V2X-ViT [ECCV 2022]
  - CoBEVFlow [NeurIPS 2023]
  - FFNet [NeurIPS 2023]
  - V2XPnP [Ours]
- 2024/06: Sample Data of V2XPnP released on Google Drive
- 2025/03: V2XPnP Dataset 1.0 (68 scenarios)
- 2025/05: V2XPnP Dataset 2.0 (all 100 scenarios)
- 2025/07: V2XPnP Codebase - Official Version 1.0
The sample data of the V2XPnP Sequential Dataset can be accessed on Google Drive, and the full dataset will be released later. The sequential perception data format follows OpenCOOD, and the trajectory dataset records the whole trajectory of each agent in each scenario.
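For reference, an OpenCOOD-style layout organizes each scenario into per-agent folders containing timestamped metadata (`.yaml`) and LiDAR (`.pcd`) files. The sketch below indexes one scenario by timestamp so temporal sequences can be assembled; the exact folder structure and field names in the V2XPnP release are assumptions here, so please check the sample data on Google Drive.

```python
# Minimal sketch of walking an OpenCOOD-style sequential scenario; the assumed
# layout is <root>/<scenario>/<agent_id>/<timestamp>.yaml|.pcd and may differ
# from the actual V2XPnP release.
import os
from collections import defaultdict

import yaml  # pip install pyyaml

def index_scenario(scenario_dir):
    """Group per-agent frames by timestamp so temporal windows can be built."""
    frames = defaultdict(dict)  # timestamp -> {agent_id: {"meta": ..., "lidar": ...}}
    for agent_id in sorted(os.listdir(scenario_dir)):
        agent_dir = os.path.join(scenario_dir, agent_id)
        if not os.path.isdir(agent_dir):
            continue
        for fname in sorted(os.listdir(agent_dir)):
            if not fname.endswith(".yaml"):
                continue
            timestamp = fname[:-len(".yaml")]
            with open(os.path.join(agent_dir, fname)) as f:
                meta = yaml.safe_load(f)  # agent pose, object annotations, etc.
            frames[timestamp][agent_id] = {
                "meta": meta,
                "lidar": os.path.join(agent_dir, timestamp + ".pcd"),
            }
    return dict(sorted(frames.items()))
```

Consecutive timestamps from this index can then be stacked into the temporal windows used for cooperative perception and prediction.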
V2XPnP belongs to the OpenCDA ecosystem family. The codebase is built upon OpenCOOD, and V2X-Real, another project in the OpenCDA ecosystem, serves as one of the data sources for this project.
If you find this repository useful for your research, please consider giving us a star 🌟 and citing our paper.
@article{zhou2024v2xpnp,
  title={V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction},
  author={Zhou, Zewei and Xiang, Hao and Zheng, Zhaoliang and Zhao, Seth Z. and Lei, Mingyue and Zhang, Yun and Cai, Tianhui and Liu, Xinyi and Liu, Johnson and Bajji, Maheswari and Pham, Jacob and Xia, Xin and Huang, Zhiyu and Zhou, Bolei and Ma, Jiaqi},
  journal={arXiv preprint arXiv:2412.01812},
  year={2024}
}