Retrieval-Augmented Guardrails for AI-Drafted Patient-Portal Messages: Error Taxonomy Construction and Large-Scale Evaluation

AI in healthcare
arXiv: 2509.22565v1
Authors

Wenyuan Chen, Fateme Nateghi Haredasht, Kameron C. Black, Francois Grolleau, Emily Alsentzer, Jonathan H. Chen, Stephen P. Ma

Abstract

Asynchronous patient-clinician messaging via EHR portals is a growing source of clinician workload, prompting interest in large language models (LLMs) to assist with draft responses. However, LLM outputs may contain clinical inaccuracies, omissions, or tone mismatches, making robust evaluation essential. Our contributions are threefold: (1) we introduce a clinically grounded error ontology comprising 5 domains and 59 granular error codes, developed through inductive coding and expert adjudication; (2) we develop a retrieval-augmented evaluation pipeline (RAEC) that leverages semantically similar historical message-response pairs to improve judgment quality; and (3) we provide a two-stage prompting architecture using DSPy to enable scalable, interpretable, and hierarchical error detection. Our approach assesses the quality of drafts both in isolation and with reference to similar past message-response pairs retrieved from institutional archives. Using a two-stage DSPy pipeline, we compared baseline and reference-enhanced evaluations on over 1,500 patient messages. Retrieval context improved error identification in domains such as clinical completeness and workflow appropriateness. Human validation on 100 messages demonstrated superior agreement (concordance = 50% vs. 33%) and performance (F1 = 0.500 vs. 0.256) of context-enhanced labels vs. baseline, supporting the use of our RAEC pipeline as AI guardrails for patient messaging.
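The abstract describes retrieving semantically similar historical message-response pairs to ground the evaluation, but the retrieval mechanics are not spelled out here. Below is a minimal sketch of one way such a step could look, assuming a generic sentence-embedding model (all-MiniLM-L6-v2 as a stand-in) and brute-force cosine similarity over a hypothetical `historical_pairs` archive; the actual model, index, and similarity metric used in the paper may differ.

```python
# Hedged sketch of retrieving similar historical message-response pairs.
# Assumptions (not from the paper): sentence-transformers embeddings and
# brute-force cosine similarity over an in-memory archive.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

# Hypothetical institutional archive of past patient message / clinician reply pairs.
historical_pairs = [
    {"patient_message": "...", "clinician_response": "..."},
    # ...
]

# Embed historical patient messages once; normalized so dot product = cosine similarity.
corpus_embeddings = model.encode(
    [p["patient_message"] for p in historical_pairs],
    normalize_embeddings=True,
)

def retrieve_similar(new_message: str, k: int = 3):
    """Return the k historical pairs most similar to the incoming patient message."""
    query = model.encode([new_message], normalize_embeddings=True)[0]
    scores = corpus_embeddings @ query          # cosine similarities
    top_idx = np.argsort(-scores)[:k]
    return [historical_pairs[i] for i in top_idx]
```

The retrieved pairs would then be supplied as additional context to the draft evaluator, alongside the new patient message and the AI-generated draft, mirroring the abstract's comparison of evaluation in isolation versus with reference to similar past exchanges.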

Paper Summary

Problem
Healthcare providers face a rapidly growing volume of patient messages sent through secure EHR portals. Despite the benefits of asynchronous communication, staffing has not kept pace, leading to delayed replies, clinician burnout, and safety risks. Large language models (LLMs) can draft replies for clinician review, but current monitoring approaches do not scale and cannot prevent patient harm in real time.
Key Innovation
The researchers developed a real-time, multi-agent framework called Retrieval-Augmented Error Checking (RAEC) that evaluates and explains potential errors in LLM-drafted patient-portal replies before they reach clinicians or patients. RAEC combines three core innovations:
1. A comprehensive, clinician-vetted error ontology (5 domains, 59 granular error codes) for identifying clinically consequential errors.
2. Retrieval of semantically similar historical message-response pairs from the local institution to personalize error detection.
3. A team of agentic LLM evaluators that classify and justify errors at inference time.
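The abstract mentions a two-stage DSPy prompting architecture for hierarchical error detection, but not its exact structure. The sketch below is one plausible wiring, assuming stage 1 flags which of the five error domains a draft violates and stage 2 assigns granular codes with a rationale within each flagged domain; the signature fields, staging, and module names are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch of a two-stage, retrieval-augmented error checker in DSPy.
# Only the overall shape (domain-level flagging followed by granular coding)
# comes from the paper summary; everything else here is an assumption.
import dspy

class FlagErrorDomains(dspy.Signature):
    """Flag which high-level error domains an AI-drafted portal reply violates."""
    patient_message: str = dspy.InputField()
    draft_response: str = dspy.InputField()
    retrieved_examples: str = dspy.InputField(
        desc="Similar historical patient message / clinician response pairs")
    flagged_domains: list[str] = dspy.OutputField(
        desc="Subset of the 5 error domains that appear to be violated")

class AssignErrorCodes(dspy.Signature):
    """Assign granular error codes within one flagged domain and justify them."""
    patient_message: str = dspy.InputField()
    draft_response: str = dspy.InputField()
    domain: str = dspy.InputField()
    error_codes: list[str] = dspy.OutputField(desc="Granular codes from the ontology")
    rationale: str = dspy.OutputField(desc="Brief justification for the assigned codes")

class TwoStageErrorChecker(dspy.Module):
    def __init__(self):
        super().__init__()
        self.flag_domains = dspy.ChainOfThought(FlagErrorDomains)
        self.assign_codes = dspy.ChainOfThought(AssignErrorCodes)

    def forward(self, patient_message, draft_response, retrieved_examples):
        # Stage 1: domain-level screening with retrieved context.
        stage1 = self.flag_domains(
            patient_message=patient_message,
            draft_response=draft_response,
            retrieved_examples=retrieved_examples,
        )
        findings = []
        # Stage 2: granular coding within each flagged domain.
        for domain in stage1.flagged_domains:
            stage2 = self.assign_codes(
                patient_message=patient_message,
                draft_response=draft_response,
                domain=domain,
            )
            findings.append({
                "domain": domain,
                "error_codes": stage2.error_codes,
                "rationale": stage2.rationale,
            })
        return findings

# Usage (requires a configured LM, e.g. dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))).
# format_pairs is a hypothetical helper that renders retrieved pairs as text:
# checker = TwoStageErrorChecker()
# findings = checker(patient_message=msg, draft_response=draft,
#                    retrieved_examples=format_pairs(retrieve_similar(msg)))
```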
Practical Impact
By producing contextually grounded error judgments that align with clinician expertise, RAEC can serve as a guardrail for AI-assisted patient messaging: it improves the accuracy and specificity of error detection, helps catch clinically consequential errors before drafts are sent, and thereby supports patient safety while easing the growing burden of asynchronous communication.
Analogy / Intuitive Explanation
Think of RAEC as a "digital editor" for AI-drafted patient-portal replies. Just as a human editor reviews a draft to catch errors and suggest improvements, RAEC combines retrieval of similar past message-response exchanges with a team of LLM evaluators to flag and explain potential errors. This helps catch clinically consequential mistakes in real time, improving patient safety and reducing the burden of asynchronous communication.
Paper Information
Categories: cs.CL, cs.AI, cs.IR
arXiv ID: 2509.22565v1
