This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios. Papers are listed in alphabetical order of the first character.
🔗Jump to: [Dataset and Simulator] [Method and Framework]
- (Talk) Robust Collaborative Perception against Communication Interruption [video], Uncertainty Quantification of Collaborative Detection for Self-Driving [video], Collaborative and Adversarial 3D Perception for Autonomous Driving [video], Vehicle-to-Vehicle Communication for Self-Driving [video], Adversarial Robustness for Self-Driving [video], 2022 1st Cooperative Perception Workshop Playback [video], Beyond-Line-of-Sight Situational Awareness via Swarm Collaboration [video], Cooperative Autonomous Driving: Simulation and Perception [video], Where2comm: Next-Generation Collaborative Perception with 100,000x Lower Communication Bandwidth [video], A Preliminary Study of V2X-Based Multi-Source Cooperative Perception [video], Crowd-Intelligence Robotic Networks for Vehicle-Infrastructure Cooperation [video], IACS 2023 Collaborative Perception PhD Sharing [video], CICV 2022 Data-Driven Vehicle-Infrastructure Cooperation Session [video]
- (Survey) Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges [paper], A Survey and Framework of Cooperative Perception: From Heterogeneous Singleton to Hierarchical Cooperation [paper]
- (Library) OpenCOOD: Open Cooperative Detection Framework for Autonomous Driving [code] [doc], CoPerception: SDK for Collaborative Perception [code] [doc], OpenCDA: Simulation Tool Integrated with Prototype Cooperative Driving Automation [code] [doc]
- (People) Runsheng Xu@UCLA [web], Yiming Li@NYU [web], Hang Qiu@Waymo [web]
- (Workshop) ICRA 2023 [web], MFI 2022 [web], ITSC 2020 [web]
- (Competition) VIC3D Object Detection Challenge (Tsinghua AIR & Baidu Apollo Vehicle-Infrastructure Cooperative Autonomous Driving Algorithm Challenge) [info]
- (Background) Current Approaches and Future Directions for Point Cloud Object Detection in Intelligent Agents [video], 3D Object Detection for Autonomous Driving: A Review and New Outlooks [paper], DACOM: Learning Delay-Aware Communication for Multi-Agent Reinforcement Learning [video], A Survey of Multi-Agent Reinforcement Learning with Communication [paper]
- V2XSet (considers both vehicles and infrastructure, as well as pose error and time delay)
Method | Source | Ideal AP@0.7 | Ideal AP@0.5 | Noisy AP@0.7 | Noisy AP@0.5 |
---|---|---|---|---|---|
MPDA [ICRA'23] | link | 🏆73.4🌟 | - | - | - |
MVRF [PAAP'22] | link | 🏆71.5⭐ | 🏆88.9🌟 | 🏆61.9🌟 | 🏆84.3🌟 |
V2X-ViT [ECCV'22] | link | 71.2 | 🏆88.2⭐ | 🏆61.4⭐ | 🏆83.6⭐ |
DiscoNet [NeurIPS'21] | link | 69.5 | 84.4 | 54.1 | 79.8 |
F-Cooper [SEC'19] | link | 68.0 | 84.0 | 46.9 | 71.5 |
V2VNet [ECCV'20] | link | 67.7 | 84.5 | 49.3 | 79.1 |
AttFuse [ICRA'22] | link | 66.4 | 80.7 | 48.7 | 70.9 |
CoBEVT [CoRL'22] | link | 66.0 | 84.9 | 54.3 | 81.1 |
Where2comm [NeurIPS'22] | link | 65.4 | 85.5 | 53.4 | 82.0 |
=== | === | === | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | link | 71.0 | 81.9 | 38.4 | 72.0 |
Late Fusion | link | 62.0 | 72.7 | 30.7 | 54.9 |
No Fusion (Lower Bound) | link | 40.2 | 60.6 | 40.2 | 60.6 |
- OPV2V (evaluates adaptation ability via a digital town with realistic configurations)
Method | Source | Default AP@0.7 | Default AP@0.5 | Culver AP@0.7 | Culver AP@0.5 |
---|---|---|---|---|---|
AdaFusion [WACV'23] | link | 🏆85.6🌟 | 🏆91.6⭐ | 🏆79.0🌟 | 🏆88.0⭐ |
FuseBEVT [CoRL'22] | link | 🏆85.2⭐ | - | - | - |
V2VAM [Arxiv'22] | link | 84.9 | 🏆92.0🌟 | 73.1 | 🏆89.3🌟 |
CoBEVT [CoRL'22] | link | 83.6 | 91.4 | 74.8 | 87.7 |
DiscoNet [NeurIPS'21] | link | 83.6 | 89.9 | - | - |
V2X-ViT [ECCV'22] | link | 82.6 | 89.1 | 73.7 | 87.3 |
V2VNet [ECCV'20] | link | 82.2 | 89.7 | 73.4 | 86.0 |
FPV-RCNN [RAL'22] | link | 82.0 | - | 🏆76.3⭐ | - |
AttFuse [ICRA'22] | link | 81.5 | 90.8 | 73.5 | 85.4 |
MAMP [ICRA'23] | link | 81.3 | - | - | - |
F-Cooper [SEC'19] | link | 79.0 | 88.7 | 72.8 | 84.6 |
V2VAM+LCRN [Arxiv'22] | link | 78.3 | 88.7 | 70.9 | 87.1 |
=== | === | === | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | link | 80.0 | 89.1 | 69.6 | 82.9 |
Late Fusion | link | 78.1 | 85.8 | 66.8 | 79.9 |
No Fusion (Lower Bound) | link | 60.2 | 67.9 | 47.1 | 55.7 |
- V2X-Sim 2.0 (multi-modality multi-agent data for detection, tracking and segmentation)
Method | Source | Detection AP@0.7 | Detection AP@0.5 |
---|---|---|---|
Where2comm [NeurIPS'22] | link | 🏆74.1🌟 | 🏆83.8🌟 |
FPV-RCNN [RAL'22] | link | 🏆72.1⭐ | 78.7 |
V2X-ViT [ECCV'22] | link | 68.1 | 🏆79.2⭐ |
Double-M Quantification [ICRA'23] | link | 66.4 | 70.4 |
DiscoNet [NeurIPS'21] | link | 63.4 | 69.0 |
AttFuse [ICRA'22] | link | 62.9 | 76.0 |
V2VNet [ECCV'20] | link | 62.8 | 68.4 |
CoAlign [ICRA'23] | link | 60.7 | 73.9 |
STAR [CoRL'22] | link | 57.2 | 62.8 |
Robust V2V [CoRL'20] | link | 56.0 | 69.3 |
F-Cooper [SEC'19] | link | 51.3 | 62.7 |
MASH [IROS'21] | link | 49.6 | 62.2 |
When2com [CVPR'20] | link | 39.9 | 44.0 |
Who2com [ICRA'20] | link | 39.9 | 44.0 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | link | 67.0 | 70.4 |
Late Fusion | link | 39.1 | 44.0 |
No Fusion (Lower Bound) | link | 44.2 | 49.9 |
- The results above are directly borrowed from publicly accessible papers. Since some of them are reported by follow-up papers rather than by the original ones, the most reliable data source link is also given for each entry. Best effort has been made to ensure that all collected benchmark results share the same training and testing settings (where reported).
- OPV2V Default
Method | AP@0.7 | AP@0.5 | AP@0.3 |
---|---|---|---|
V2VNet [ECCV'20] | 🏆84.6🌟 | 🏆94.2🌟 | 🏆94.7⭐ |
AdaFusion [WACV'23] | 🏆83.6⭐ | 93.6 | 94.1 |
FuseBEVT [CoRL'22] | 83.3 | 93.0 | 93.7 |
Where2comm [NeurIPS'22] | 82.3 | 93.5 | 94.0 |
DiscoNet [NeurIPS'21] | 82.3 | 93.4 | 94.2 |
V2X-ViT [ECCV'22] | 81.5 | 🏆94.1⭐ | 🏆94.8🌟 |
F-Cooper [SEC'19] | 81.4 | 93.4 | 94.2 |
AttFuse [ICRA'22] | 81.2 | 93.1 | 93.8 |
Where2comm [NeurIPS'22] | 80.7 | 92.2 | 92.9 |
When2com [CVPR'20] | 75.6 | 89.5 | 90.1 |
Who2com [ICRA'20] | 75.6 | 89.5 | 90.1 |
When2com [CVPR'20] | 71.0 | 87.8 | 89.0 |
Who2com [ICRA'20] | 66.9 | 86.0 | 87.3 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | 85.0 | 94.6 | 95.4 |
Late Fusion | 76.2 | 90.9 | 91.8 |
No Fusion (Lower Bound) | 65.1 | 87.9 | 89.8 |
- OPV2V Culver
Method | AP@0.7 | AP@0.5 | AP@0.3 |
---|---|---|---|
V2VNet [ECCV'20] | 🏆75.8🌟 | 🏆88.0🌟 | 🏆89.5🌟 |
DiscoNet [NeurIPS'21] | 🏆73.7⭐ | 🏆87.2⭐ | 🏆88.7⭐ |
FuseBEVT [CoRL'22] | 73.2 | 85.7 | 87.3 |
AttFuse [ICRA'22] | 72.8 | 87.0 | 88.4 |
AdaFusion [WACV'23] | 72.7 | 86.6 | 88.1 |
Where2comm [NeurIPS'22] | 72.3 | 86.8 | 88.2 |
Where2comm [NeurIPS'22] | 71.5 | 86.5 | 88.0 |
F-Cooper [SEC'19] | 70.8 | 86.9 | 🏆88.7⭐ |
V2X-ViT [ECCV'22] | 70.2 | 86.4 | 88.6 |
When2com [CVPR'20] | 60.6 | 80.4 | 82.3 |
Who2com [ICRA'20] | 60.6 | 80.4 | 82.3 |
When2com [CVPR'20] | 58.7 | 79.1 | 81.5 |
Who2com [ICRA'20] | 51.6 | 75.5 | 79.0 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | 73.5 | 88.2 | 89.8 |
Late Fusion | 64.9 | 86.4 | 89.5 |
No Fusion (Lower Bound) | 57.2 | 79.7 | 83.4 |
- V2XSet Ideal
Method | AP@0.7 | AP@0.5 | AP@0.3 |
---|---|---|---|
V2VNet [ECCV'20] | 🏆80.3🌟 | 🏆92.0⭐ | 🏆93.0⭐ |
DiscoNet [NeurIPS'21] | 🏆78.9⭐ | 🏆92.0⭐ | 92.9 |
AdaFusion [WACV'23] | 78.6 | 🏆92.1🌟 | 92.9 |
FuseBEVT [CoRL'22] | 78.5 | 90.8 | 91.8 |
Where2comm [NeurIPS'22] | 78.0 | 91.6 | 92.4 |
AttFuse [ICRA'22] | 77.1 | 91.0 | 91.9 |
V2X-ViT [ECCV'22] | 76.3 | 🏆92.1🌟 | 🏆93.3🌟 |
Where2comm [NeurIPS'22] | 76.0 | 90.1 | 91.0 |
F-Cooper [SEC'19] | 75.8 | 91.4 | 92.6 |
When2com [CVPR'20] | 67.9 | 86.4 | 87.5 |
Who2com [ICRA'20] | 67.9 | 86.4 | 87.5 |
When2com [CVPR'20] | 61.1 | 83.0 | 84.9 |
Who2com [ICRA'20] | 60.4 | 81.8 | 83.8 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | 80.1 | 93.1 | 94.0 |
Late Fusion | 67.4 | 87.2 | 89.3 |
No Fusion (Lower Bound) | 57.9 | 83.5 | 86.6 |
- V2XSet Noisy
Method | AP@0.7 | AP@0.5 | AP@0.3 |
---|---|---|---|
V2VNet [ECCV'20] | 🏆57.0🌟 | 🏆88.7🌟 | 🏆92.7🌟 |
AttFuse [ICRA'22] | 🏆53.4⭐ | 86.3 | 90.2 |
V2X-ViT [ECCV'22] | 53.2 | 88.0 | 🏆92.6⭐ |
DiscoNet [NeurIPS'21] | 52.7 | 🏆88.2⭐ | 92.1 |
Where2comm [NeurIPS'22] | 52.7 | 87.4 | 91.0 |
Where2comm [NeurIPS'22] | 51.3 | 85.9 | 89.7 |
AdaFusion [WACV'23] | 51.2 | 87.8 | 92.1 |
FuseBEVT [CoRL'22] | 51.1 | 85.9 | 89.8 |
F-Cooper [SEC'19] | 50.4 | 86.5 | 90.8 |
When2com [CVPR'20] | 48.2 | 81.4 | 85.2 |
Who2com [ICRA'20] | 48.2 | 81.4 | 85.2 |
When2com [CVPR'20] | 41.9 | 77.7 | 83.3 |
Who2com [ICRA'20] | 37.2 | 75.8 | 82.2 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | 51.4 | 90.1 | 93.8 |
Late Fusion | 40.3 | 77.2 | 86.4 |
No Fusion (Lower Bound) | 57.9 | 83.5 | 86.6 |
- Joint Set
Method | AP@0.7 | AP@0.5 | AP@0.3 |
---|---|---|---|
V2VNet [ECCV'20] | 🏆81.6🌟 | 🏆92.5🌟 | 🏆93.4🌟 |
AdaFusion [WACV'23] | 🏆80.2⭐ | 91.6 | 92.5 |
DiscoNet [NeurIPS'21] | 80.0 | 91.6 | 92.6 |
Where2comm [NeurIPS'22] | 79.9 | 91.3 | 92.2 |
FuseBEVT [CoRL'22] | 79.8 | 90.9 | 91.9 |
AttFuse [ICRA'22] | 78.9 | 91.0 | 91.9 |
Where2comm [NeurIPS'22] | 78.5 | 90.1 | 91.1 |
V2X-ViT [ECCV'22] | 78.1 | 🏆92.1⭐ | 🏆93.4🌟 |
F-Cooper [SEC'19] | 78.1 | 91.7 | 🏆92.8⭐ |
When2com [CVPR'20] | 69.7 | 86.1 | 87.2 |
Who2com [ICRA'20] | 69.7 | 86.1 | 87.2 |
When2com [CVPR'20] | 64.1 | 84.3 | 85.9 |
Who2com [ICRA'20] | 60.9 | 81.8 | 83.7 |
=== | === | === | === |
Early Fusion (Upper Bound) Cooper [ICDCS'19] | 82.1 | 93.2 | 94.2 |
Late Fusion | 73.8 | 89.6 | 91.2 |
No Fusion (Lower Bound) | 62.8 | 84.4 | 86.8 |
- In the Joint Set evaluation, the OPV2V test split (16 scenes), OPV2V Culver City test split (4 scenes), OPV2V validation split (9 scenes), V2XSet test split (19 scenes) and V2XSet validation split (6 scenes) are combined into one much larger evaluation set (54 scenes in total) to allow a more stable ranking; a minimal sketch of this split combination is given below. The evaluated models are trained on the union of the OPV2V and V2XSet train splits, with ego vehicle shuffling as data augmentation.
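A minimal sketch of how such a joint evaluation set could be assembled from scene folders on disk; the directory names and the `collect_scenes` helper below are hypothetical, not part of any released toolkit:

```python
from pathlib import Path

def collect_scenes(*split_dirs):
    """Gather scene folders from several dataset splits into one evaluation list."""
    scenes = []
    for split in split_dirs:
        # each split directory is assumed to hold one sub-folder per scene
        scenes.extend(sorted(p for p in Path(split).iterdir() if p.is_dir()))
    return scenes

# hypothetical local paths for the five splits listed above
joint_eval_scenes = collect_scenes(
    "OPV2V/test",         # 16 scenes
    "OPV2V/test_culver",   # 4 scenes
    "OPV2V/validate",      # 9 scenes
    "V2XSet/test",         # 19 scenes
    "V2XSet/validate",     # 6 scenes
)
print(len(joint_eval_scenes))  # expected: 54
```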
- By default, each message is broadcast to all agents, forming a fully connected communication graph. Considering collaboration efficiency and bandwidth constraints, Who2com, When2com and Where2comm further apply different strategies to prune the fully connected communication graph into a partially connected one during inference. Both the fully connected mode and the partially connected mode are evaluated here, and the latter is marked in italics; a toy illustration of the two modes follows.
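As an illustration of the two modes, the sketch below first builds a fully connected communication graph and then prunes it with a naive confidence threshold; this is only a toy stand-in for the learned selection strategies of Who2com / When2com / Where2comm, and `conf_score` is a hypothetical per-agent quantity:

```python
import numpy as np

def fully_connected_graph(num_agents):
    """Default mode: every agent broadcasts to every other agent."""
    adj = np.ones((num_agents, num_agents), dtype=bool)
    np.fill_diagonal(adj, False)  # no self-messages
    return adj

def prune_graph(adj, conf_score, threshold=0.5):
    """Toy pruning: a link is kept only if the sender is confident enough.

    The real methods learn which links (and, for Where2comm, which spatial
    regions) are worth transmitting instead of using a fixed threshold.
    """
    pruned = adj.copy()
    for sender in range(adj.shape[0]):
        if conf_score[sender] < threshold:
            pruned[sender, :] = False  # this sender stays silent
    return pruned

adj = fully_connected_graph(4)
partial = prune_graph(adj, conf_score=np.array([0.9, 0.2, 0.7, 0.4]))
print(int(partial.sum()), "of", int(adj.sum()), "links kept")  # 6 of 12 links kept
```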
- For a fair comparison, all methods adopt identical one-stage training settings in the ideal scenario (i.e., no pose error or time delay), without weight fine-tuning or message compression. Extra fusion modules of the intermediate collaboration mode (e.g., down-sampling convolution layers) are simplified when not strictly necessary, to mitigate concerns about where the performance gain actually comes from. PointPillar is adopted as the backbone for all reproduced methods; the generic intermediate-fusion pipeline they share is sketched below.
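For readers unfamiliar with the intermediate collaboration mode evaluated above, the following PyTorch sketch shows the generic pipeline most reproduced methods share (each agent encodes its point cloud into a BEV feature map, the shared features are warped into the ego frame and fused, then detection heads run on the fused map); all module names here are simplified placeholders, not OpenCOOD APIs:

```python
import torch
import torch.nn as nn

class IntermediateCollaboration(nn.Module):
    """Generic intermediate-fusion skeleton (placeholder modules for illustration)."""

    def __init__(self, bev_channels=64, num_anchors=2):
        super().__init__()
        # stand-in for a PointPillar-style encoder that outputs a BEV feature map
        self.encoder = nn.Conv2d(bev_channels, bev_channels, 3, padding=1)
        # stand-in for the per-method fusion module (attention, graph, transformer, ...)
        self.fusion = nn.Conv2d(bev_channels, bev_channels, 1)
        self.cls_head = nn.Conv2d(bev_channels, num_anchors, 1)
        self.reg_head = nn.Conv2d(bev_channels, 7 * num_anchors, 1)

    def forward(self, agent_bev_inputs):
        # 1) each agent encodes its own observation locally
        feats = [self.encoder(x) for x in agent_bev_inputs]
        # 2) features are transmitted to the ego vehicle (spatial warping to the
        #    ego frame is omitted here) and stacked
        stacked = torch.stack(feats, dim=0)
        # 3) the ego fuses its own and its neighbours' features (element-wise max
        #    as a toy choice; real methods use learned fusion)
        fused = self.fusion(stacked.max(dim=0).values)
        # 4) standard anchor-based detection heads
        return self.cls_head(fused), self.reg_head(fused)

model = IntermediateCollaboration()
agents = [torch.randn(1, 64, 100, 100) for _ in range(3)]  # ego + 2 collaborators
cls_map, reg_map = model(agents)
print(cls_map.shape, reg_map.shape)
```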
- Although the reproduction process is simple and quick (a whole round takes less than 2 days with only two 3090 GPUs), multiple advanced training strategies are applied, which may boost performance and make the ranking differ from the original reports. The reproduction is merely intended as a straightforward and fair evaluation of representative collaborative perception methods. To see how the official results were obtained, please refer to the papers and code collected below.
- Note: {Real} denotes that the sensor data is obtained by real-world collection instead of simulation.
- DeepAccident (DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving) [paper] [code] [project]
- CoPerception-UAVs+ (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code] [project]
- OPV2V+ (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code] [project]
- {Real} V2V4Real (V2V4Real: A Large-Scale Real-World Dataset for Vehicle-to-Vehicle Cooperative Perception) [paper] [code] [project]
- {Real} V2X-Seq (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code] [project]
- {Real} DAIR-V2X-C Complemented (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code] [project]
- RLS (Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library) [paper] [code] [project]
- V2XP-ASG (V2XP-ASG: Generating Adversarial Scenes for Vehicle-to-Everything Perception) [paper] [code] [project]
- AutoCastSim (COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles) [paper] [code] [project]
- {Real} DAIR-V2X (DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection) [paper] [code] [project]
- CoPerception-UAVs (Where2comm: Efficient Collaborative Perception via Spatial Confidence Maps) [paper&review] [code] [project]
- V2XSet (V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer) [paper] [code] [project]
- OPV2V (OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication) [paper] [code] [project]
- DOLPHINS (DOLPHINS: Dataset for Collaborative Perception Enabled Harmonious and Interconnected Self-Driving) [paper] [code] [project]
- V2X-Sim (V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving) [paper] [code] [project]
- Note: {Related} denotes that it is not a pure collaborative perception paper but has related content.
- {Related} CBR (Calibration-free BEV Representation for Infrastructure Perception) [paper] [code]
- Mode: No Collaboration (only infrastructure data)
- Dataset: DAIR-V2X
- Task: Detection
- Input: RGB Image
- FFNet (Vehicle-Infrastructure Cooperative 3D Object Detection via Feature Flow Prediction) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: DAIR-V2X
- Task: Detection
- Input: Point Cloud
- MOT-CUP (Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation) [paper] [code]
- Mode: Early Collaboration, Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Tracking
- Input: Point Cloud
- ROBOSAC (Among Us: Adversarially Robust Collaborative Perception by Consensus) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- UMC (UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V, V2X-Sim
- Task: Detection
- Input: Point Cloud
- VIMI (VIMI: Vehicle-Infrastructure Multi-view Intermediate Fusion for Camera-based 3D Object Detection) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: DAIR-V2X
- Task: Detection
- Input: RGB Image
- V2VLC (Learning for Vehicle-to-Vehicle Cooperative Perception under Lossy Communication) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V
- Task: Detection
- Input: Point Cloud
- V2XFormer (DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: DeepAccident
- Task: Detection, Forecasting
- Input: RGB Image
- {Related} BEVHeight (BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection) [paper] [code]
- Mode: No Collaboration (only infrastructure data)
- Dataset: DAIR-V2X, V2X-Sim
- Task: Detection
- Input: RGB Image
- CoCa3D (Collaboration Helps Camera Overtake LiDAR in 3D Detection) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V+, DAIR-V2X, CoPerception-UAVs+
- Task: Detection
- Input: RGB Image
- FF-Tracking (V2X-Seq: The Large-Scale Sequential Dataset for the Vehicle-Infrastructure Cooperative Perception and Forecasting) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Seq
- Task: Tracking
- Input: Point Cloud
- {Related} CO3 (CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving) [paper&review] [code]
- Mode: Early Collaboration (for contrastive learning)
- Dataset: DAIR-V2X
- Task: Representation Learning
- Input: Point Cloud
- AdaFusion (Adaptive Feature Fusion for Cooperative Perception Using LiDAR Point Clouds) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V, CODD
- Task: Detection
- Input: Point Cloud
- CoAlign (Robust Collaborative 3D Object Detection in Presence of Pose Errors) [paper] [code]
- Mode: Intermediate Collaboration, Late Collaboration
- Dataset: OPV2V, V2X-Sim, DAIR-V2X
- Task: Detection
- Input: Point Cloud
- {Related} DMGM (Deep Masked Graph Matching for Correspondence Identification in Collaborative Perception) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: CAD
- Task: Correspondence Identification
- Input: RGBD Image
- Double-M Quantification (Uncertainty Quantification of Collaborative Detection for Self-Driving) [paper] [code]
- Mode: Early Collaboration, Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- MAMP (Model-Agnostic Multi-Agent Perception Framework) [paper] [code]
- Mode: Late Collaboration
- Dataset: OPV2V
- Task: Detection
- Input: Point Cloud
- MATE (Communication-Critical Planning via Multi-Agent Trajectory Exchange) [paper] [code]
- Mode: Late Collaboration
- Dataset: AutoCastSim (simulator), CoBEV-Sim (simulator)
- Task: Planning
- Input: Point Cloud
- MPDA (Bridging the Domain Gap for Multi-Agent Perception) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2XSet
- Task: Detection
- Input: Point Cloud
- Coopernaut (COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: AutoCastSim (simulator)
- Task: Planning
- Input: Point Cloud
- {Related} LAV (Learning from All Vehicles) [paper] [code]
- Mode: Late Collaboration (for training)
- Dataset: CARLA (simulator)
- Task: Planning, Detection (auxiliary supervision), Segmentation (auxiliary supervision)
- Input: RGB Image, Point Cloud
- TCLF (DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection) [paper] [code]
- Mode: Late Collaboration
- Dataset: DAIR-V2X
- Task: Detection
- Input: RGB Image, Point Cloud
- Where2comm (Where2comm: Efficient Collaborative Perception via Spatial Confidence Maps) [paper&review] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V, V2X-Sim, DAIR-V2X, CoPerception-UAVs
- Task: Detection
- Input: Point Cloud
- SyncNet (Latency-Aware Collaborative Perception) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- V2X-ViT (V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2XSet
- Task: Detection
- Input: Point Cloud
- CoBEVT (CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers) [paper&review] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V, nuScenes
- Task: Segmentation, Detection
- Input: RGB Image, Point Cloud
- STAR (Multi-Robot Scene Completion: Towards Task-Agnostic Collaborative Perception) [paper&review] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Segmentation, Detection
- Input: Point Cloud
- IA-RCP (Robust Collaborative Perception against Communication Interruption) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- CRCNet (Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- AttFuse (OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: OPV2V
- Task: Detection
- Input: Point Cloud
- MP-Pose (Multi-Robot Collaborative Perception with Graph Neural Networks) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: AirSim-MAP
- Task: Segmentation
- Input: RGB Image
- DiscoNet (Learning Distilled Collaboration Graph for Multi-Agent Perception) [paper&review] [code]
- Mode: Early Collaboration (teacher model), Intermediate Collaboration (student model)
- Dataset: V2X-Sim
- Task: Detection
- Input: Point Cloud
- Adversarial V2V (Adversarial Attacks On Multi-Agent Communication) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2V-Sim (not publicly available)
- Task: Adversarial Attack
- Input: Point Cloud
- MASH (Overcoming Obstructions via Bandwidth-Limited Multi-Agent Spatial Handshaking) [paper] [code]
- Mode: Late Collaboration
- Dataset: AirSim (simulator)
- Task: Segmentation
- Input: RGB Image
- When2com (When2com: Multi-Agent Perception via Communication Graph Grouping) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: AirSim-MAP
- Task: Segmentation, Classification
- Input: RGB Image
- V2VNet (V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction) [paper] [code]
- Mode: Intermediate Collaboration
- Dataset: V2V-Sim (not publicly available)
- Task: Detection, Forecasting
- Input: Point Cloud