Leveraging Imperfection with MEDLEY: A Multi-Model Approach Harnessing Bias in Medical AI

AI in healthcare
Published: arXiv:2508.21648v1
Authors

Farhad Abtahi, Mehdi Astaraki, Fernando Seoane

Abstract

Bias in medical artificial intelligence is conventionally viewed as a defect requiring elimination. However, human reasoning inherently incorporates biases shaped by education, culture, and experience, suggesting their presence may be inevitable and potentially valuable. We propose MEDLEY (Medical Ensemble Diagnostic system with Leveraged diversitY), a conceptual framework that orchestrates multiple AI models while preserving their diverse outputs rather than collapsing them into a consensus. Unlike traditional approaches that suppress disagreement, MEDLEY documents model-specific biases as potential strengths and treats hallucinations as provisional hypotheses for clinician verification. A proof-of-concept demonstrator was developed using over 30 large language models, creating a minimum viable product that preserved both consensus and minority views in synthetic cases, making diagnostic uncertainty and latent biases transparent for clinical oversight. While not yet a validated clinical tool, the demonstration illustrates how structured diversity can enhance medical reasoning under clinician supervision. By reframing AI imperfection as a resource, MEDLEY offers a paradigm shift that opens new regulatory, ethical, and innovation pathways for developing trustworthy medical AI systems.

Paper Summary

Problem
This paper addresses bias in medical artificial intelligence (AI). Conventional approaches aim to eliminate bias, yet human reasoning inherently incorporates biases shaped by education, culture, and experience, suggesting that bias may be inevitable and potentially valuable. The paper also highlights the limitations of large language models (LLMs) in clinical contexts, including hallucinations (ungrounded outputs) and the "black-box" nature of deep learning systems, which complicates accountability and trust.
Key Innovation
The key innovation of this work is MEDLEY (Medical Ensemble Diagnostic system with Leveraged diversitY), a conceptual framework that orchestrates multiple AI models while preserving their diverse outputs. Unlike traditional approaches that suppress disagreement, MEDLEY documents model-specific biases as potential strengths and treats hallucinations as provisional hypotheses for clinician verification. This approach reframes AI imperfection as a resource, rather than a liability.
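The core mechanism — collecting outputs from many models and preserving both consensus and minority views rather than collapsing them into a single answer — can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation; the function name, input format, and labels are assumptions made for clarity.

```python
from collections import defaultdict

def medley_aggregate(model_outputs):
    """Group candidate diagnoses by the models proposing them, keeping every
    suggestion visible instead of collapsing to a majority vote.

    model_outputs: dict mapping a model name to its list of diagnoses
    (a hypothetical input format chosen for this sketch).
    """
    support = defaultdict(list)
    for model, diagnoses in model_outputs.items():
        for dx in diagnoses:
            support[dx].append(model)
    # Rank by level of agreement, but retain single-model (minority) views,
    # which MEDLEY treats as provisional hypotheses for clinician review.
    ranked = sorted(support.items(), key=lambda kv: -len(kv[1]))
    return [
        {
            "diagnosis": dx,
            "supporting_models": models,
            "status": "consensus" if len(models) > 1 else "minority",
        }
        for dx, models in ranked
    ]

# Example with three hypothetical models:
report = medley_aggregate({
    "model_a": ["pneumonia", "pulmonary embolism"],
    "model_b": ["pneumonia"],
    "model_c": ["pneumonia", "heart failure"],
})
```

The point of the design is what is *not* done: no averaging or majority filtering. Every diagnosis, including those raised by a single model, survives into the report with its provenance, making disagreement and potential bias visible to the supervising clinician.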
Practical Impact
This research could reshape how trustworthy medical AI systems are developed. By preserving diagnostic plurality and making bias visible, MEDLEY offers a paradigm shift that opens new regulatory, ethical, and innovation pathways. The framework could be applied across clinical domains such as imaging, diagnostics, and workflow support to enhance medical reasoning under clinician supervision, potentially leading to more accurate diagnoses, improved patient outcomes, and increased trust in AI systems.
Analogy / Intuitive Explanation
Imagine a multidisciplinary tumour board in clinical practice, where experts from different fields come together to discuss and analyze a patient's case. Each expert brings their unique perspective and experience, which can lead to a more comprehensive understanding of the patient's condition. Similarly, MEDLEY orchestrates multiple AI models, each with its own biases and strengths, to generate a more complete picture of a patient's diagnosis. By preserving the diversity of these models, MEDLEY creates a framework for clinicians to verify and validate the outputs, leading to more accurate and trustworthy diagnoses.
Paper Information
Categories: cs.AI; MSC 68T07, 68T09, 68T20 (Primary); 62P10, 62C20, 62H30 (Secondary)
arXiv ID: 2508.21648v1