A central multimodal fusion framework for outdoor scene image segmentation
Title | A central multimodal fusion framework for outdoor scene image segmentation |
Publication Type | Journal Article |
Year of Publication | Submitted |
Authors | Zhang Y, Morel O, Seulin R, Meriaudeau F, Sidibe D |
Journal | MULTIMEDIA TOOLS AND APPLICATIONS |
Type of Article | Article; Early Access |
ISSN | 1380-7501 |
Keywords | Deep learning, Image Fusion, Multi-modality, Semantic segmentation |
Abstract | Robust multimodal fusion is one of the challenging research problems in semantic scene understanding. In real-world applications, a fusion system can overcome the drawbacks of individual sensors by exploiting the different feature representations and statistical properties of multiple modalities (e.g., RGB-depth cameras, multispectral cameras). In this paper, we propose a novel central multimodal fusion framework for semantic image segmentation of road scenes, aiming to effectively learn joint feature representations and optimally combine deep neural networks with statistical priors. More specifically, the proposed fusion framework automatically generates a central branch by sequentially mapping multimodal features, both low-level and high-level, into a common space. In addition, to reduce model uncertainty, we employ statistical fusion to compute the final prediction, which leads to a significant performance improvement. We conduct extensive experiments on various outdoor scene datasets. Both qualitative and quantitative experiments demonstrate that our central fusion framework achieves competitive performance against existing multimodal fusion methods. |
DOI | 10.1007/s11042-020-10357-y |