Learning ECG Representations via Poly-Window Contrastive Learning

AI in healthcare
Published: arXiv: 2508.15225v1
Authors

Yi Yuan, Joseph Van Duyn, Runze Yan, Zhuoyi Huang, Sulaiman Vesal, Sergey Plis, Xiao Hu, Gloria Hyunjung Kwak, Ran Xiao, Alex Fedorov

Abstract

Electrocardiogram (ECG) analysis is foundational for cardiovascular disease diagnosis, yet the performance of deep learning models is often constrained by limited access to annotated data. Self-supervised contrastive learning has emerged as a powerful approach for learning robust ECG representations from unlabeled signals. However, most existing methods generate only pairwise augmented views and fail to leverage the rich temporal structure of ECG recordings. In this work, we present a poly-window contrastive learning framework. We extract multiple temporal windows from each ECG instance to construct positive pairs and maximize their agreement via statistics. Inspired by the principle of slow feature analysis, our approach explicitly encourages the model to learn temporally invariant and physiologically meaningful features that persist across time. We validate our approach through extensive experiments and ablation studies on the PTB-XL dataset. Our results demonstrate that poly-window contrastive learning consistently outperforms conventional two-view methods in multi-label superclass classification, achieving higher AUROC (0.891 vs. 0.888) and F1 scores (0.680 vs. 0.679) while requiring up to four times fewer pre-training epochs (32 vs. 128) and reducing total wall-clock pre-training time by 14.8%. Despite processing multiple windows per sample, we achieve a significant reduction in the number of training epochs and total computation time, making our method practical for training foundational models. Through extensive ablations, we identify optimal design choices and demonstrate robustness across various hyperparameters. These findings establish poly-window contrastive learning as a highly efficient and scalable paradigm for automated ECG analysis and provide a promising general framework for self-supervised representation learning in biomedical time-series data.

Paper Summary

Problem
Cardiovascular disease (CVD) is the leading cause of death globally, and accurate electrocardiogram (ECG) analysis is critical for early detection and diagnosis. However, deep learning models for ECG analysis are often constrained by limited access to annotated data, and most existing contrastive methods use only pairwise augmented views, which leaves the temporal structure of ECG recordings largely unused.
Key Innovation
The authors propose poly-window contrastive learning, which extracts multiple temporal windows from each ECG instance, treats them as positive pairs, and maximizes their agreement via statistics. Inspired by the principle of slow feature analysis, the approach encourages the model to learn temporally invariant and physiologically meaningful features that persist across time.
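The idea of drawing several windows from one recording and scoring how well their representations agree can be sketched in a few lines. This is only an illustration under loud assumptions: the signal is synthetic, `toy_encoder` is a hypothetical stand-in for a neural encoder, and mean pairwise cosine similarity is a swapped-in agreement measure, since this summary does not specify the statistic or loss the paper uses.

```python
import math
import random


def extract_windows(signal, num_windows, window_len, rng):
    """Cut `num_windows` random (possibly overlapping) windows from one recording."""
    max_start = len(signal) - window_len
    starts = [rng.randint(0, max_start) for _ in range(num_windows)]
    return [signal[s:s + window_len] for s in starts]


def toy_encoder(window):
    """Hypothetical stand-in for a learned encoder: three summary features."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    mean_abs_diff = sum(abs(b - a) for a, b in zip(window, window[1:])) / (n - 1)
    return [mean, std, mean_abs_diff]


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mean_pairwise_agreement(embeddings):
    """Average cosine similarity over all unordered pairs of windows.

    Assumption: this is an illustrative agreement score, not the paper's loss.
    """
    pairs = [(i, j) for i in range(len(embeddings))
             for j in range(i + 1, len(embeddings))]
    total = sum(cosine(embeddings[i], embeddings[j]) for i, j in pairs)
    return total / len(pairs)


rng = random.Random(0)
# Synthetic single-channel signal (not real ECG): a periodic wave plus noise.
signal = [math.sin(2 * math.pi * 1.2 * t / 100) + 0.05 * rng.gauss(0, 1)
          for t in range(1000)]

windows = extract_windows(signal, num_windows=4, window_len=250, rng=rng)
embeddings = [toy_encoder(w) for w in windows]
agreement = mean_pairwise_agreement(embeddings)
print(f"{len(windows)} windows, mean pairwise agreement = {agreement:.3f}")
```

With four windows there are six positive pairs per recording instead of the single pair a two-view method provides. In a real training setup the agreement term would be maximized for windows of the same recording while being contrasted against windows from other recordings.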
Practical Impact
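The poly-window contrastive learning framework has the potential to improve the accuracy and efficiency of ECG analysis, which could support earlier diagnosis and better patient outcomes. By leveraging multiple temporal windows, the model can capture slow, physiologically relevant features that persist across the ECG recording. On PTB-XL multi-label superclass classification, the abstract reports slightly higher AUROC (0.891 vs. 0.888) and F1 (0.680 vs. 0.679) than conventional two-view methods, with up to four times fewer pre-training epochs (32 vs. 128) and a 14.8% reduction in total wall-clock pre-training time.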
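The efficiency figures in the abstract can be combined into a rough back-of-the-envelope estimate of how much more expensive each poly-window epoch is. The calculation below assumes that the 14.8% wall-clock reduction compares the same two runs as the epoch counts (32 vs. 128). The abstract does not state this explicitly, so the derived per-epoch ratio is an inference, not a number reported by the authors.

```python
# Figures quoted in the abstract.
baseline_epochs = 128    # conventional two-view pre-training
poly_epochs = 32         # poly-window pre-training
time_reduction = 0.148   # reported reduction in total wall-clock pre-training time

# How many times fewer epochs the poly-window run needs.
epoch_ratio = baseline_epochs / poly_epochs

# Total poly-window time as a fraction of the baseline's total time.
relative_total_time = 1 - time_reduction

# Assumption: both figures describe the same pair of runs. Then
#   relative_total_time = (poly_epochs * poly_cost) / (baseline_epochs * base_cost)
# so the per-epoch cost ratio poly_cost / base_cost is:
implied_per_epoch_cost = relative_total_time * epoch_ratio

print(f"epoch ratio: {epoch_ratio:.1f}x fewer epochs")
print(f"implied per-epoch cost: {implied_per_epoch_cost:.3f}x the baseline")
```

Under that assumption, each poly-window epoch costs roughly 3.4 times as much as a two-view epoch, and the fourfold drop in epochs more than offsets it.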
Analogy / Intuitive Explanation
Imagine trying to recognize a person's face from different angles and lighting conditions. Traditional contrastive learning methods are like comparing two photos of the same person, taken from slightly different angles. Poly-window contrastive learning is like comparing multiple photos of the same person, taken from different angles and lighting conditions, to learn a more robust and generalizable representation of the person's face. Similarly, the poly-window contrastive learning framework compares multiple temporal windows from each ECG instance to learn a more accurate and efficient representation of the ECG signal.
Paper Information
Categories: cs.LG, eess.SP
arXiv ID: 2508.15225v1
