A central multimodal fusion framework for outdoor scene image segmentation

Title: A central multimodal fusion framework for outdoor scene image segmentation
Publication type: Journal Article
Year of publication: Submitted
Authors: Zhang Y, Morel O, Seulin R, Meriaudeau F, Sidibe D
Journal: Multimedia Tools and Applications
Type of article: Article; Early Access
ISSN: 1380-7501
Keywords: Deep learning, Image Fusion, Multi-modality, Semantic segmentation
Abstract

Robust multimodal fusion is one of the challenging research problems in semantic scene understanding. In real-world applications, the fusion system can overcome the drawbacks of individual sensors by taking different feature representations and statistical properties of multiple modalities (e.g., RGB-depth cameras, multispectral cameras). In this paper, we propose a novel central multimodal fusion framework for semantic image segmentation of road scenes, aiming to effectively learn joint feature representations and optimally combine deep neural networks with statistical priors. More specifically, the proposed fusion framework can automatically generate a central branch by sequentially mapping multimodal features into a common space, including both low-level and high-level features. Besides, in order to reduce the model uncertainty, we employ statistical fusion to compute the final prediction, which leads to significant performance improvement. We conduct extensive experiments on various outdoor scene datasets. Both qualitative and quantitative experiments demonstrate that our central fusion framework achieves competitive performance against existing multimodal fusion methods.

DOI: 10.1007/s11042-020-10357-y