Multi-agent collaborative perception (MCP) has recently attracted much attention. It comprises three key processes: communication for sharing, collaboration for integration, and reconstruction for the downstream tasks. Existing methods focus on designing the collaboration process alone, ignoring the intrinsic interactions among the three processes and thus yielding suboptimal performance. In contrast, we propose a Unified Multi-resolution Collaborative perception framework named UMC, which jointly optimizes the communication, collaboration, and reconstruction processes with a multi-resolution technique. The communication stage introduces a novel trainable multi-resolution and selective-region (MRSR) mechanism, achieving higher quality at lower bandwidth. A graph-based collaboration module is then proposed, operating at each resolution to match the MRSR. Finally, the reconstruction stage integrates the multi-resolution collaborative features for the downstream tasks. Since general metrics cannot systematically reflect the performance gains brought by MCP, we introduce a new evaluation metric that assesses MCP from multiple perspectives. To validate our algorithm, we conduct experiments on the V2X-Sim and OPV2V datasets. Our quantitative and qualitative results show that the proposed UMC outperforms state-of-the-art collaborative perception approaches.
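To make the bandwidth-saving idea behind selective-region communication concrete, the following is a minimal, simplified sketch (not the paper's implementation): it scores each spatial cell of a feature map by its channel-wise entropy and transmits only the highest-entropy cells. The function names `entropy_map` and `select_regions` and the `keep_ratio` parameter are illustrative assumptions, not identifiers from UMC.

```python
import numpy as np

def entropy_map(feature, eps=1e-8):
    """Per-cell Shannon entropy over the channel dimension of a (C, H, W) feature map."""
    p = np.abs(feature) + eps
    p = p / p.sum(axis=0, keepdims=True)       # normalize channels into a distribution per cell
    return -(p * np.log(p)).sum(axis=0)        # (H, W) entropy map

def select_regions(feature, keep_ratio=0.25):
    """Keep only the highest-entropy cells; zero out the rest to reduce bandwidth."""
    ent = entropy_map(feature)
    k = max(1, int(keep_ratio * ent.size))
    thresh = np.partition(ent.ravel(), -k)[-k]          # k-th largest entropy value
    mask = (ent >= thresh).astype(feature.dtype)        # (H, W) binary transmission map
    return feature * mask, mask
```

In the actual framework, such a transmission map would be computed at each resolution of the MRSR mechanism, so coarse and fine features can be pruned independently before being sent over V2X links.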
Overview of the proposed UMC framework. i) Feature-extraction stage: each agent obtains \({F}^{e,t}_{i}\) by applying the shared feature encoder \(\Theta^{e}\) to its observation \(x^{t}_{i}\). ii) Communication stage: the ego agent (in blue) broadcasts a compact query matrix \({M}^{e,t}_{i}\) at each resolution to its collaborators via V2X communication, and each collaborator computes a transmission map locally with the entropy-CS module. The ego agent then receives the selected messages from the assisting collaborators. iii) Collaboration stage: the ego agent employs the G-CGRU module at each resolution for highly efficient collaboration. iv) Reconstruction stage: the MGFE module reconstructs the ego agent's feature from the multi-resolution collaborative feature maps for different downstream tasks.
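The collaboration stage can be pictured as a recurrent gated fusion: the ego feature acts as a hidden state, and each collaborator's received feature is folded in through GRU-style gates. The sketch below is a heavily simplified stand-in for the paper's G-CGRU (it uses per-channel linear maps where the real module uses spatial convolutions on a collaboration graph); the class name `GatedFusionCell` and all weight names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GatedFusionCell:
    """GRU-style gated fusion with 1x1-conv-equivalent (per-channel linear) maps."""
    def __init__(self, channels, seed=0):
        rng = np.random.default_rng(seed)
        # weights for update gate z, reset gate r, and candidate state h~
        self.Wz, self.Uz = rng.standard_normal((2, channels, channels)) * 0.1
        self.Wr, self.Ur = rng.standard_normal((2, channels, channels)) * 0.1
        self.Wh, self.Uh = rng.standard_normal((2, channels, channels)) * 0.1

    def step(self, h, x):
        # h: ego hidden feature (C, H, W); x: one collaborator's feature (C, H, W)
        lin = lambda W, f: np.einsum('oc,chw->ohw', W, f)
        z = sigmoid(lin(self.Wz, x) + lin(self.Uz, h))          # update gate
        r = sigmoid(lin(self.Wr, x) + lin(self.Ur, h))          # reset gate
        h_cand = np.tanh(lin(self.Wh, x) + lin(self.Uh, r * h)) # candidate state
        return (1 - z) * h + z * h_cand                         # fused feature

def fuse(ego, collaborator_features, cell):
    """Sequentially fold each collaborator's feature into the ego feature."""
    h = ego
    for x in collaborator_features:
        h = cell.step(h, x)
    return h
```

In UMC one such recurrent fusion would run independently at every resolution of the MRSR hierarchy, and the resulting multi-resolution collaborative features are what the MGFE module later merges for the downstream heads.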
@inproceedings{wang2023umc,
title = {UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework},
author = {Wang, Tianhang and Chen, Guang and Chen, Kai and Liu, Zhengfa and Zhang, Bo and Knoll, Alois and Jiang, Changjun},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2023}
}