Causally-Guided Pairwise Transformer -- Towards Foundational Digital Twins in Process Industry

Published: arXiv: 2508.13111v1
Authors

Michael Mayr, Georgios C. Chasparis

Abstract

Foundational modelling of multi-dimensional time-series data in industrial systems presents a central trade-off: channel-dependent (CD) models capture specific cross-variable dynamics but lack robustness and adaptability as model layers are commonly bound to the data dimensionality of the tackled use-case, while channel-independent (CI) models offer generality at the cost of modelling the explicit interactions crucial for system-level predictive regression tasks. To resolve this, we propose the Causally-Guided Pairwise Transformer (CGPT), a novel architecture that integrates a known causal graph as an inductive bias. The core of CGPT is built around a pairwise modeling paradigm, tackling the CD/CI conflict by decomposing the multidimensional data into pairs. The model uses channel-agnostic learnable layers where all parameter dimensions are independent of the number of variables. CGPT enforces a CD information flow at the pair-level and CI-like generalization across pairs. This approach disentangles complex system dynamics and results in a highly flexible architecture that ensures scalability and any-variate adaptability. We validate CGPT on a suite of synthetic and real-world industrial datasets on long-term and one-step forecasting tasks designed to simulate common industrial complexities. Results demonstrate that CGPT significantly outperforms both CI and CD baselines in predictive accuracy and shows competitive performance with end-to-end trained CD models while remaining agnostic to the problem dimensionality.

Paper Summary

Problem
The European process industry faces growing pressure from economic competition and regulation, particularly energy-efficiency and greenhouse-gas reduction targets. Remaining globally competitive requires keeping pace with industrial and scientific advances. Retrofitted sensors across many sectors have sharply increased data volume, creating an opportunity to use that data for better operational efficiency and decision-making. Foundational models for this multivariate time-series data face a trade-off, however: channel-dependent (CD) models capture cross-variable dynamics but are tied to the dimensionality of one use case, while channel-independent (CI) models generalize but do not model the explicit interactions between variables.
Key Innovation
The Causally-Guided Pairwise Transformer (CGPT) is a novel architecture that integrates a known causal graph as an inductive bias. This approach tackles the CD/CI conflict by decomposing multidimensional data into pairs, using channel-agnostic learnable layers where all parameter dimensions are independent of the number of variables.
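The pairwise decomposition can be illustrated with a small sketch. It assumes that each edge of the known causal graph yields one (driver, target) pair and that the graph is given as an adjacency matrix where a non-zero entry at [i, j] means "variable i drives variable j". The function names, the adjacency convention, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def causal_pairs(adjacency):
    """Return (source, target) index pairs for every edge in the causal graph.

    adjacency[i, j] != 0 is taken to mean "variable i drives variable j"
    (an assumed convention for this sketch).
    """
    sources, targets = np.nonzero(adjacency)
    return list(zip(sources.tolist(), targets.tolist()))

def decompose_into_pairs(window, adjacency):
    """Split a (time, variables) window into one (time, 2) array per causal edge."""
    return {
        (src, tgt): window[:, [src, tgt]]
        for src, tgt in causal_pairs(adjacency)
    }

# Hypothetical 3-variable system with edges 0 -> 1, 0 -> 2, 1 -> 2.
adjacency = np.array([
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
])
window = np.arange(12, dtype=float).reshape(4, 3)  # 4 time steps, 3 variables

pairs = decompose_into_pairs(window, adjacency)
for edge, series in pairs.items():
    print(edge, series.shape)
```

Every pair has the same (time, 2) shape regardless of how many variables the system contains, which is what lets downstream layers stay independent of the number of variables.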
Practical Impact
The CGPT architecture is designed for scalability and any-variate adaptability, which the authors present as a significant step towards a versatile, "one-for-all" predictive model for the process industry. It handles arbitrary sensor configurations without architectural changes, and it draws on causal drivers for long-term forecasting, where it is reported to outperform both channel-independent and channel-dependent baselines.
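To make "any-variate adaptability" concrete, the sketch below applies one shared weight matrix to systems with 3 and 6 variables without changing any parameter shape. It is only an illustration under stated assumptions: a single linear map stands in for the transformer layers, predictions for a target are averaged over its causal parents, and the names `forecast`, `LOOKBACK`, and `HORIZON` are hypothetical. None of these choices is confirmed by the source.

```python
import numpy as np

rng = np.random.default_rng(0)
LOOKBACK, HORIZON = 8, 2

# One weight matrix shared by every pair: its shape depends only on the
# look-back and horizon lengths, never on the number of variables.
W = rng.normal(scale=0.1, size=(2 * LOOKBACK, HORIZON))

def forecast(window, adjacency, weights):
    """Forecast each target by averaging shared-weight predictions over its parents.

    window: (LOOKBACK, n_vars); adjacency[i, j] != 0 means i drives j (assumed).
    Returns (HORIZON, n_vars); variables without causal parents stay NaN.
    """
    n_vars = window.shape[1]
    out = np.full((weights.shape[1], n_vars), np.nan)
    for tgt in range(n_vars):
        parents = np.nonzero(adjacency[:, tgt])[0]
        if parents.size == 0:
            continue  # no causal driver, so this sketch makes no prediction
        preds = [
            np.concatenate([window[:, src], window[:, tgt]]) @ weights
            for src in parents
        ]
        out[:, tgt] = np.mean(preds, axis=0)  # assumed aggregation over pairs
    return out

# The same W serves a 3-variable and a 6-variable system (toy random data).
small = forecast(rng.normal(size=(LOOKBACK, 3)), np.triu(np.ones((3, 3)), k=1), W)
large = forecast(rng.normal(size=(LOOKBACK, 6)), np.triu(np.ones((6, 6)), k=1), W)
print(small.shape, large.shape, W.shape)
```

Adding or removing sensors changes only the adjacency matrix and the number of pairs that are evaluated; the learned parameters are untouched.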
Analogy / Intuitive Explanation
Imagine trying to understand a complex system by examining each component in isolation. That is what channel-independent models do with industrial processes: they miss the cross-variable dynamics that are crucial for predicting real-world outcomes. CGPT instead takes a "systems thinking" approach. Guided by a known causal graph, it breaks the system into pairs of variables and learns how the two variables in each pair interact. Because the same learned layers are reused across all pairs, the model captures channel-dependent interactions within each pair while generalizing across pairs in a channel-independent way.
Paper Information
Categories: cs.LG
arXiv ID: 2508.13111v1
