Preventing Shortcut Learning in Medical Image Analysis through Intermediate Layer Knowledge Distillation from Specialist Teachers

Published on arXiv: 2511.17421v1
Authors

Christopher Boland, Sotirios Tsaftaris, Sonia Dahdouh

Abstract

Deep learning models are prone to learning shortcut solutions to problems using spuriously correlated yet irrelevant features of their training data. In high-risk applications such as medical image analysis, this phenomenon may prevent models from using clinically meaningful features when making predictions, potentially leading to poor robustness and harm to patients. We demonstrate that different types of shortcuts (those that are diffuse and spread throughout the image, as well as those that are localized to specific areas) manifest distinctly across network layers and can, therefore, be more effectively targeted through mitigation strategies that target the intermediate layers. We propose a novel knowledge distillation framework that leverages a teacher network fine-tuned on a small subset of task-relevant data to mitigate shortcut learning in a student network trained on a large dataset corrupted with a bias feature. Through extensive experiments on CheXpert, ISIC 2017, and SimBA datasets using various architectures (ResNet-18, AlexNet, DenseNet-121, and 3D CNNs), we demonstrate consistent improvements over traditional Empirical Risk Minimization, augmentation-based bias-mitigation, and group-based bias-mitigation approaches. In many cases, we achieve comparable performance with a baseline model trained on bias-free data, even on out-of-distribution test data. Our results demonstrate the practical applicability of our approach to real-world medical imaging scenarios where bias annotations are limited and shortcut features are difficult to identify a priori.

Paper Summary

Problem
Deep learning models used in medical image analysis can latch onto shortcut solutions based on spuriously correlated but clinically irrelevant features instead of clinically meaningful ones. This undermines robustness and can ultimately harm patients. The core challenge is to prevent shortcut learning and improve how well these models generalize.
Key Innovation
The researchers propose a novel knowledge distillation framework that leverages a teacher network fine-tuned on a small subset of task-relevant data to mitigate shortcut learning in a student network trained on a large dataset corrupted with a bias feature. The approach is distinctive in applying distillation at the network's intermediate layers, which the authors show reduces bias more effectively than final-layer distillation alone.
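To make the idea concrete, here is a minimal PyTorch sketch of intermediate-layer distillation under this setup. The choice of layers (layer2 and layer3 of a ResNet-18), the MSE feature-matching loss, and the weighting term alpha are illustrative assumptions, not the authors' exact implementation:

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Teacher: fine-tuned on a small bias-free subset
# (teacher.load_state_dict(...) would load those weights in practice).
# Student: trained on the large, bias-corrupted dataset.
teacher = resnet18(num_classes=2).eval()
student = resnet18(num_classes=2)

# Capture intermediate feature maps via forward hooks.
feats = {"teacher": {}, "student": {}}

def make_hook(model_name, layer_name):
    def hook(module, inputs, output):
        feats[model_name][layer_name] = output
    return hook

LAYERS = ("layer2", "layer3")  # illustrative choice of intermediate layers
for name in LAYERS:
    getattr(teacher, name).register_forward_hook(make_hook("teacher", name))
    getattr(student, name).register_forward_hook(make_hook("student", name))

def distillation_loss(x, y, alpha=1.0):
    """Task loss on the biased data plus MSE alignment of intermediate features."""
    with torch.no_grad():
        teacher(x)                      # populates feats["teacher"]
    logits = student(x)                 # populates feats["student"]
    task = F.cross_entropy(logits, y)
    distill = sum(
        F.mse_loss(feats["student"][name], feats["teacher"][name])
        for name in LAYERS
    )
    return task + alpha * distill

# Example training step with hypothetical data shapes:
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, 2, (8,))
loss = distillation_loss(x, y)
loss.backward()
```

Aligning feature maps at intermediate layers lets the teacher's representations constrain where the student encodes information, which is what allows the method to address both diffuse and localized shortcut features that manifest at different depths of the network.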
Practical Impact
This research has significant practical implications for medical image analysis, where bias annotations are limited and shortcut features are difficult to identify a priori. The proposed approach can be applied to real-world medical imaging scenarios to improve the performance and robustness of deep learning models. By reducing bias and improving generalization, this approach can lead to better patient outcomes and more accurate diagnoses.
Analogy / Intuitive Explanation
Think of a deep learning model as a student learning a new language. Left alone, the student might rely on shortcuts, such as guessing a sentence's meaning from a single familiar word, instead of learning the grammar and nuances of the language. The proposed approach is like pairing the student with a teacher who already knows the language and who steers the student toward the meaningful features rather than the shortcuts, leading to a more accurate and robust understanding.
Paper Information
Categories: cs.CV, cs.AI
arXiv ID: 2511.17421v1
