SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model

Published: arXiv: 2601.15504v1
Authors

Xianghao Zhan, Jingyu Xu, Yuanning Zheng, Zinaida Good, Olivier Gevaert

Abstract

Spatial transcriptomics enables spatial gene expression profiling, motivating computational models that capture spatially conditioned regulatory relationships. We introduce SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCNs) trained with a masked central spot prediction objective. Trained on 416 human Visium samples spanning 15 organs, SAGE-FM learns spatially coherent embeddings that robustly recover masked genes, with 91% of masked genes showing significant correlations (p < 0.05). The embeddings generated by SAGE-FM outperform MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity. SAGE-FM generalizes to downstream tasks, enabling 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improving glioblastoma subtype prediction relative to MOFA. In silico perturbation experiments further demonstrate that the model captures directional ligand-receptor and upstream-downstream regulatory effects consistent with ground truth. These results demonstrate that simple, parameter-efficient GCNs can serve as biologically interpretable and spatially aware foundation models for large-scale spatial transcriptomics.

Paper Summary

Problem
Spatial transcriptomics (ST) technologies profile gene expression transcriptome-wide while preserving the spatial architecture of tissues, but integrating spatial coordinates with transcriptomic signals remains technically challenging. The paper addresses this gap: extracting robust, biologically meaningful representations from ST data.
Key Innovation
The key innovation is SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCNs) and trained with a masked central spot prediction objective: the expression of a central spot is hidden and the model predicts it from the neighboring spots. SAGE-FM learns spatially coherent embeddings that robustly recover masked genes, and these embeddings outperform MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity.
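To make the masked central spot objective concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the two-layer architecture, zero-masking, mean squared error loss, hexagonal 7-spot neighborhood, and all names (`normalized_adjacency`, `predict_masked_center`, `w1`, `w2`) are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def normalized_adjacency(adj):
    """Standard GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def predict_masked_center(x, adj, center, w1, w2):
    """Hide the central spot's expression and predict it from its neighbors.

    Returns (predicted expression of the center, embedding of the center).
    """
    x_masked = x.copy()
    x_masked[center] = 0.0                      # assumption: mask by zeroing
    a_norm = normalized_adjacency(adj)
    h = np.maximum(a_norm @ x_masked @ w1, 0.0) # GCN layer 1 + ReLU (embeddings)
    pred = a_norm @ h @ w2                      # GCN layer 2 (reconstruction)
    return pred[center], h[center]

# Toy neighborhood (assumed layout): spot 0 is the center, spots 1..6 form
# a hexagonal ring around it, as on a Visium-style grid.
n_spots, n_genes, n_hidden, center = 7, 5, 4, 0
adj = np.zeros((n_spots, n_spots))
for i in range(1, n_spots):
    adj[0, i] = adj[i, 0] = 1.0                 # center <-> ring spot
    j = 1 + (i % 6)                             # next spot on the ring
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(n_spots, n_genes))         # stand-in for normalized expression
w1 = rng.normal(size=(n_genes, n_hidden))       # untrained, random weights
w2 = rng.normal(size=(n_hidden, n_genes))

pred, embedding = predict_masked_center(x, adj, center, w1, w2)
loss = float(np.mean((pred - x[center]) ** 2))  # assumption: MSE training loss
print("masked-center reconstruction loss:", loss)
```

In a real training loop the weights would be updated to minimize this loss over many neighborhoods. Because the center's own expression is hidden, the prediction can only draw on spatial context, which is what pushes the embeddings to encode neighborhood-level structure.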
Practical Impact
The learned embeddings are relevant to downstream biological tasks such as cell type annotation, mapping proximity-based interactions, and discovering spatial biomarkers, therapeutic targets, and disease mechanisms. In the paper's evaluations, SAGE-FM reaches 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improves glioblastoma subtype prediction relative to MOFA. These results suggest the model could support generalizable, biologically meaningful representation learning across diverse tissues.
Analogy / Intuitive Explanation
Imagine a map of a city with different neighborhoods, each representing a specific type of cell or tissue. SAGE-FM is like a GPS system that can navigate this map, identifying the location of specific genes and their relationships with other genes in the neighborhood. This allows researchers to extract robust, biologically meaningful representations from spatial transcriptomics data, which can be used to understand cellular organization in health and disease.
Paper Information
Categories: cs.LG, q-bio.GN, q-bio.QM
arXiv ID: 2601.15504v1
