Context-specific Credibility-aware Multimodal Fusion with Conditional Probabilistic Circuits

Generative AI & LLMs
Published: arXiv: 2603.26629v1
Authors

Pranuthi Tenali, Sahil Sidheekh, Saurabh Mathur, Erik Blasch, Kristian Kersting, Sriraam Natarajan

Abstract

Multimodal fusion requires integrating information from multiple sources that may conflict depending on context. Existing fusion approaches typically rely on static assumptions about source reliability, limiting their ability to resolve conflicts when a modality becomes unreliable due to situational factors such as sensor degradation or class-specific corruption. We introduce C$^2$MF, a context-specific credibility-aware multimodal fusion framework that models per-instance source reliability using a Conditional Probabilistic Circuit (CPC). We formalize instance-level reliability through Context-Specific Information Credibility (CSIC), a KL-divergence-based measure computed exactly from the CPC. CSIC generalizes conventional static credibility estimates as a special case, enabling principled and adaptive reliability assessment. To evaluate robustness under cross-modal conflicts, we propose the Conflict benchmark, in which class-specific corruptions deliberately induce discrepancies between different modalities. Experimental results show that C$^2$MF improves predictive accuracy by up to 29% over static-reliability baselines in high-noise settings, while preserving the interpretability advantages of probabilistic circuit-based fusion.
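To make "class-specific corruption" concrete, the following is a minimal sketch of one way such a conflict could be induced. It is not the paper's benchmark protocol, which this summary does not describe. The function name `corrupt_class_specific`, the use of additive Gaussian noise, and the toy data are all assumptions made for illustration.

```python
import random

def corrupt_class_specific(features, labels, corrupted_classes, noise_std=1.0, seed=0):
    """Return a copy of `features` where only rows whose label is in
    `corrupted_classes` receive additive Gaussian noise.

    Hypothetical helper: the noise model is an assumption, not the paper's.
    """
    rng = random.Random(seed)
    corrupted = []
    for row, label in zip(features, labels):
        if label in corrupted_classes:
            # This modality becomes unreliable for this class only.
            corrupted.append([v + rng.gauss(0.0, noise_std) for v in row])
        else:
            # All other classes keep their clean features.
            corrupted.append(list(row))
    return corrupted

# Toy "image" modality with three samples; only class 1 is corrupted.
image_feats = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
labels = [0, 1, 0]
noisy_image_feats = corrupt_class_specific(
    image_feats, labels, corrupted_classes={1}, noise_std=2.0
)
```

If a second modality is left clean, the two modalities now disagree only on class-1 samples, which is the kind of context-dependent conflict a static reliability weight cannot capture.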

Paper Summary

Problem
The paper addresses multimodal fusion in real-world environments: integrating information from multiple sources, such as images, audio, and text, to make a decision. These sources can conflict, and how reliable each one is depends on the context, so it is hard to know which source to trust for a given input. Existing fusion approaches often assume source reliability is static, an assumption that breaks down when a modality degrades only in certain situations.
Key Innovation
The key innovation is C2MF, a context-specific credibility-aware multimodal fusion framework that models per-instance source reliability using a Conditional Probabilistic Circuit (CPC). C2MF evaluates the credibility of each source based on its position in a learned latent context, which allows reliability to vary from one instance to the next while preserving exact probabilistic semantics. Conventional static credibility estimates arise as a special case, so the approach supports principled and adaptive reliability assessment.
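A KL-divergence-based, per-instance credibility score can be sketched on a toy discrete model. The exact CSIC definition is not given in this summary, so the sketch below assumes a leave-one-out form: the credibility of modality i for an instance x is the KL divergence between P(Y | x) and P(Y | x with modality i marginalized out). A small joint probability table stands in for the CPC, and the names `conditional`, `kl`, and `credibility` are hypothetical.

```python
import math

def conditional(joint, evidence):
    """P(Y | evidence) from a joint table keyed by (y, x1, x2).

    `evidence` maps a key position (1 for x1, 2 for x2) to an observed value;
    positions that are absent are marginalized out.
    """
    scores = {}
    for key, p in joint.items():
        if all(key[pos] == val for pos, val in evidence.items()):
            scores[key[0]] = scores.get(key[0], 0.0) + p
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}

def kl(p, q):
    """KL(p || q) for two distributions over the same label set."""
    return sum(pv * math.log(pv / q[y]) for y, pv in p.items() if pv > 0)

def credibility(joint, x, i):
    """Assumed leave-one-out score: how much does dropping modality i
    change the predictive distribution for this particular instance x?"""
    evidence = {1: x[0], 2: x[1]}
    full = conditional(joint, evidence)
    without_i = conditional(joint, {k: v for k, v in evidence.items() if k != i})
    return kl(full, without_i)

# Toy joint: Y is a fair coin, X1 matches Y 90% of the time,
# X2 is pure noise that is independent of Y.
joint = {}
for y in (0, 1):
    for x1 in (0, 1):
        for x2 in (0, 1):
            joint[(y, x1, x2)] = 0.5 * (0.9 if x1 == y else 0.1) * 0.5

instance = (1, 0)                         # observed (x1, x2)
cred_x1 = credibility(joint, instance, 1)  # informative modality
cred_x2 = credibility(joint, instance, 2)  # uninformative modality
```

In this toy setup the informative modality X1 receives a strictly positive score, while the noise modality X2 scores approximately zero because removing it leaves the prediction unchanged. In the paper, the corresponding quantities are computed exactly from the CPC rather than from an explicit table.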
Practical Impact
The work targets real-world applications such as autonomous navigation, industrial robotics, and medical decision support. The reported improvement of up to 29% in predictive accuracy over static-reliability baselines in high-noise settings suggests that C2MF can support more reliable decisions in these critical domains. In addition, the Context-Specific Information Credibility (CSIC) metric provides a mathematically grounded audit trail: each modality's influence can be calculated exactly on a per-instance basis, which is particularly important in high-stakes domains.
Analogy / Intuitive Explanation
Imagine you're deciding what to wear based on the weather. You have several sources of information: a high-resolution camera showing the sky, a microphone picking up what sounds like raindrops, and a text message from a friend who says it's sunny. When the camera and the microphone disagree, you have to decide which source to trust in that particular situation; in this case, the friend's text message is the more reliable one. C2MF works like a decision-maker that weighs the credibility of each source according to the context, instead of always trusting the same source, and so arrives at a more accurate and reliable decision.
Paper Information
Categories: cs.LG
arXiv ID: 2603.26629v1
