incomplete_multimodal_fusion

Adaptive Multimodal Learning for Remote Sensing Data Fusion

In the realm of remote sensing data fusion, the efficacy of multimodal Transformer networks hinges on their adeptness at integrating diverse signals via self-attention mechanisms. Yet, prevailing methodologies assume access to all modalities throughout both training and inference phases, leading to notable performance degradation when confronted with modal-incomplete inputs in practical scenarios. To surmount this challenge, we present a pioneering approach to incomplete multimodal learning tailored specifically for remote sensing data fusion and the multimodal Transformer architecture.

Our method, adaptable to both supervised and self-supervised pre-training paradigms, introduces novel techniques to seamlessly incorporate incomplete modalities into the learning process. Central to our strategy is the utilization of learned fusion tokens, coupled with modality attention and masked self-attention mechanisms, to efficiently aggregate multimodal signals within the Transformer framework. Crucially, our approach accommodates random modality combinations during network training, enhancing adaptability and robustness.

Employing a fusion-centric training regimen bolstered by reconstruction and contrastive loss functions, our method not only facilitates effective fusion during pre-training but also excels in handling incomplete inputs during inference. Experimental validation on two diverse multimodal datasets underscores the efficacy of our approach, demonstrating state-of-the-art performance across tasks such as building instance/semantic segmentation and land-cover mapping. Notably, our method showcases remarkable resilience in scenarios where incomplete modalities are prevalent, further consolidating its utility in real-world applications of remote sensing data fusion.

By offering a comprehensive solution to the challenge of incomplete multimodal learning, our approach significantly advances the frontier of remote sensing data fusion, opening avenues for enhanced performance and adaptability in diverse operational contexts.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
downstream		downstream
images		images
pretraining		pretraining
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

incomplete_multimodal_fusion

About

Releases

Packages

Languages

Yusin2Chen/incomplete_multimodal_fusion

Folders and files

Latest commit

History

Repository files navigation

incomplete_multimodal_fusion

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages