Exploration of Deep Learning-based Multimodal Fusion for Semantic Road Scene Segmentation
Title | Exploration of Deep Learning-based Multimodal Fusion for Semantic Road Scene Segmentation |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Zhang Y, Morel O, Blanchon M, Seulin R, Rastgoo M, Sidibe D |
Editor | Tremeau A, Farinella GM, Braz J |
Conference Name | Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP), Vol. 5 |
Publisher | SCITEPRESS |
Publisher Address | AV D MANUELL, 27A 2 ESQ, SETUBAL, 2910-595, PORTUGAL |
ISBN Number | 978-989-758-354-4 |
Keywords | Deep learning, Multimodal fusion, Road scenes, Semantic segmentation |
Abstract | Deep neural networks have been frequently used for semantic scene understanding in recent years. Effective and robust segmentation of outdoor scenes is a prerequisite for the safe navigation of autonomous vehicles. In this paper, we aim to find the best exploitation of different imaging modalities for road scene segmentation, as opposed to using a single RGB modality. We explore deep learning-based early and late fusion patterns for semantic segmentation and propose a new multi-level feature fusion network. Given a pair of aligned multimodal images, the network achieves faster convergence and incorporates more contextual information. In particular, we introduce a first-of-its-kind dataset containing aligned raw RGB and polarimetric images with manually labeled ground truth. Polarization cameras provide a sensory augmentation that can significantly enhance image understanding, notably for the detection of highly reflective areas such as glass and water. Experimental results show that the proposed multimodal fusion network outperforms unimodal networks and two typical fusion architectures. |
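The abstract describes fusing aligned RGB and polarimetric images at multiple encoder levels rather than only at the input (early fusion) or at the predictions (late fusion). The sketch below is not the authors' implementation; it is a minimal PyTorch illustration of the multi-level fusion idea, assuming aligned inputs of equal spatial size, an arbitrary 4-channel polarimetric encoding, 7 output classes, and element-wise summation as the fusion rule.

```python
# Minimal sketch of multi-level RGB + polarimetric feature fusion for
# semantic segmentation (illustrative only; layer widths, channel counts,
# class count, and the sum-based fusion rule are assumptions).
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """3x3 conv -> BatchNorm -> ReLU, then 2x spatial downsampling."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class MultiLevelFusionSeg(nn.Module):
    def __init__(self, rgb_ch=3, pol_ch=4, num_classes=7):
        super().__init__()
        widths = (32, 64, 128)
        # Two modality-specific encoders with identical structure.
        self.rgb_enc = nn.ModuleList()
        self.pol_enc = nn.ModuleList()
        c_rgb, c_pol = rgb_ch, pol_ch
        for w in widths:
            self.rgb_enc.append(conv_block(c_rgb, w))
            self.pol_enc.append(conv_block(c_pol, w))
            c_rgb = c_pol = w
        # Decoder: project fused features and upsample to input resolution.
        self.decoder = nn.Sequential(
            nn.Conv2d(widths[-1], 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(64, num_classes, kernel_size=1),
        )

    def forward(self, rgb, pol):
        x_rgb, x_pol = rgb, pol
        fused = None
        # Fuse the two streams at every encoder level (multi-level fusion):
        # polarimetric features are summed into the RGB stream before the
        # next level, so both modalities contribute contextual information.
        for rgb_block, pol_block in zip(self.rgb_enc, self.pol_enc):
            x_rgb = rgb_block(x_rgb)
            x_pol = pol_block(x_pol)
            fused = x_rgb + x_pol
            x_rgb = fused
        return self.decoder(fused)


if __name__ == "__main__":
    model = MultiLevelFusionSeg()
    rgb = torch.randn(1, 3, 256, 256)   # aligned RGB image
    pol = torch.randn(1, 4, 256, 256)   # assumed 4-channel polarimetric image
    out = model(rgb, pol)
    print(out.shape)  # torch.Size([1, 7, 256, 256])
```

In this sketch, early fusion would instead concatenate the two images along the channel axis before a single encoder, and late fusion would run two full networks and merge their class scores; the multi-level variant sits between these two extremes.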
DOI | 10.5220/0007360403360343 |