Explainability, not just accuracy, crucial for trust in medical AI systems
Artificial intelligence is storming into healthcare, promising improved diagnoses and tailored treatments, but a pressing question looms: can doctors and patients trust tools they can’t fully understand? A new peer-reviewed study titled “What Is the Role of Explainability in Medical Artificial Intelligence? A Case-Based Approach,” published in Bioengineering, examines the ethical dimensions and practical implications of explainability in AI-powered clinical decision support systems (CDSSs).
More specifically, the study asks: Who needs explainability? What kind matters? How does it shape decisions, trust, and patient relationships? The answers, it turns out, aren’t one-size-fits-all. The paper exposes the tension between technological opacity and ethical accountability, while underlining the potential for explainable AI (xAI) to strengthen decision-making, enhance trust, and safeguard patient rights.
The study analyzes four use cases, each investigating a distinct aspect of AI in medicine. The first explores a machine learning model used to detect out-of-hospital cardiac arrests through emergency calls - a system that outperformed human dispatchers in speed and sensitivity but lacked any form of explainability. Despite the model's technical prowess, dispatchers often ignored its alerts. The study suggests that the absence of explanations undermined trust, prompting human operators to dismiss AI-generated alarms that could have saved lives. This raises concerns about whether performance alone is enough to ensure adoption in high-stakes environments.
The second case turns to the early diagnosis of Alzheimer’s disease using convolutional neural networks enhanced with LIME, a post hoc explanation method. The model achieved high accuracy, but feature importance varied across models, creating inconsistencies in which brain regions were identified as relevant to the diagnosis. This variance complicated medical validation and risked misleading clinicians, highlighting that not all explanations - especially correlational ones - enhance transparency.
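To see how such inconsistencies can arise, consider the minimal sketch below - not the study's imaging pipeline, but an illustration using the open-source lime and scikit-learn packages, with hypothetical "region" features standing in for brain areas. Two classifiers trained on the same data are asked to explain the same case, and the LIME attributions can differ noticeably between them even when their predictions agree.

```python
# Illustrative sketch (not the study's pipeline): two classifiers with similar
# accuracy can still produce diverging LIME attributions for the same case.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
feature_names = [f"region_{i}" for i in range(X.shape[1])]  # hypothetical "brain regions"

models = {
    "random_forest": RandomForestClassifier(random_state=0).fit(X, y),
    "logistic_regression": LogisticRegression(max_iter=1000).fit(X, y),
}

explainer = LimeTabularExplainer(
    X,
    feature_names=feature_names,
    class_names=["control", "alzheimer"],
    mode="classification",
)

# Explain the same patient with each model and compare which features dominate.
patient = X[0]
for name, model in models.items():
    explanation = explainer.explain_instance(patient, model.predict_proba, num_features=3)
    print(name, explanation.as_list())
```

Running a comparison like this is one simple way to check whether a post hoc explanation is stable enough to be clinically meaningful, which is precisely the property the second case found lacking.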
A more promising scenario is the third use case, which involves a hybrid model developed for diagnosing adult ADHD. This system combines machine learning with a knowledge-based rule system derived from clinical guidelines. By offering interpretable if-then logic for each diagnostic outcome, it helps triage patients into categories of diagnostic certainty. Junior clinicians can manage straightforward cases, while more ambiguous situations are referred to senior psychiatrists. Here, explainability not only bolsters trust but also facilitates workforce efficiency and faster diagnosis.
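A minimal sketch of that hybrid pattern might look like the following, with thresholds and rule wording invented for illustration rather than taken from the study's clinical guidelines: the machine learning score passes through transparent if-then rules that route each case to the appropriate clinician along with a readable rationale.

```python
# Minimal sketch of the hybrid idea: an ML score plus transparent if-then rules
# that triage cases by diagnostic certainty. Thresholds and rule wording are
# assumptions for illustration, not the study's clinical guideline rules.
from dataclasses import dataclass


@dataclass
class TriageDecision:
    tier: str        # who should handle the case
    rationale: str   # human-readable if-then justification


def triage_adhd_case(model_probability: float, meets_symptom_criteria: bool) -> TriageDecision:
    if meets_symptom_criteria and model_probability >= 0.85:
        return TriageDecision(
            "junior_clinician",
            "IF symptom criteria met AND model probability >= 0.85 THEN high-certainty positive",
        )
    if not meets_symptom_criteria and model_probability <= 0.15:
        return TriageDecision(
            "junior_clinician",
            "IF symptom criteria not met AND model probability <= 0.15 THEN high-certainty negative",
        )
    return TriageDecision(
        "senior_psychiatrist",
        "IF evidence is mixed THEN refer for specialist assessment",
    )


print(triage_adhd_case(0.91, True).rationale)   # clear case, handled by junior clinician
print(triage_adhd_case(0.55, True).tier)        # ambiguous case, escalated
```

The rationale string is the point: every routing decision carries an explanation a clinician can read and contest, which is what makes the triage workflow defensible.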
The fourth case focuses on the simulation of ethical decision-making in healthcare through a fuzzy cognitive map (FCM). This tool incorporates patient preferences and biomedical principles such as beneficence and autonomy to suggest treatment pathways. While explainable and transparent, the model raises philosophical concerns. Can machines accurately replicate complex moral reasoning? The study argues that although the system provides clarity in how decisions are formed, its foundational assumptions and limitations make it unsuitable as a replacement for human deliberation, especially in ethics-driven medical dilemmas.
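The fuzzy cognitive map idea can be sketched in a few lines: concepts such as patient preferences and ethical principles hold activation values that influence one another through a signed weight matrix until the map settles. The concepts and weights below are illustrative assumptions, not the study's model.

```python
# Toy fuzzy cognitive map (FCM): concepts hold activation levels in [0, 1] and
# influence each other through a signed weight matrix until the state settles.
# Concepts and weights here are illustrative assumptions, not the study's model.
import numpy as np

concepts = [
    "patient_preference_comfort",
    "beneficence",
    "autonomy",
    "recommend_aggressive_treatment",
]

# W[i, j] = influence of concept i on concept j (negative = inhibitory).
W = np.array([
    [0.0, 0.0, 0.4, -0.6],   # strong comfort preference discourages aggressive treatment
    [0.0, 0.0, 0.0,  0.5],   # beneficence pushes toward treatment expected to help
    [0.0, 0.0, 0.0, -0.3],   # respecting autonomy tempers the recommendation
    [0.0, 0.0, 0.0,  0.0],
])


def squash(x, steepness=2.0):
    """Sigmoid transfer function keeping activations in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-steepness * x))


state = np.array([0.9, 0.7, 0.8, 0.5])  # initial activations
for _ in range(20):                      # iterate until approximately stable
    state = squash(state @ W + state)

for name, value in zip(concepts, state):
    print(f"{name}: {value:.2f}")
```

Because every weight is visible and every update step can be traced, the model is transparent in exactly the sense the study describes - yet the sketch also makes plain how much moral nuance is flattened into a handful of numbers, which is the study's reason for keeping humans in charge of the deliberation.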
A consistent thread across these cases is the critical role of the end-user, whether clinician, dispatcher, or patient, in interpreting AI outputs. The study identifies seven domains where explainability plays a pivotal role: the identity of the explanation's recipient, medical decision-making quality, the nature of the explanation, the balance between explainability and accuracy, automation bias, the consideration of individual values, and the impact on doctor–patient relationships.
The findings reveal that clinicians are particularly vulnerable to automation bias, especially when AI systems are perceived to hold epistemic authority. When an AI tool delivers results without context or reasoning, clinicians may either over-trust or dismiss the outputs, leading to suboptimal decisions. Yet explanations are no cure-all: the risk of blind reliance can also increase when explanations appear more plausible than they actually are. This paradox underscores the need for rigorous empirical studies of how clinicians interact with xAI in practice.
Patient autonomy is another domain where explainability proves essential. In scenarios involving sensitive diagnoses or morally complex decisions, patients require understandable justifications for recommendations that impact their lives. Without this, shared decision-making - a cornerstone of modern medical ethics - risks being compromised.
The study also challenges the conventional assumption that accuracy alone justifies the deployment of AI tools in healthcare. Where explanations support patient understanding, clinician judgment, and regulatory compliance, their value may outweigh marginal gains in predictive precision. The study notes that even highly accurate models may fail to deliver better outcomes if users do not trust or understand them.
Moreover, the study calls into question the widespread use of black box models in clinical environments. It points out that developers often prioritize performance metrics without adequately accounting for usability or ethical compliance. It recommends integrating explainability into the design process from the outset, rather than appending post hoc interpretability mechanisms as an afterthought.
The study also acknowledges a major limitation: the lack of empirical evidence. Most use cases are still in experimental phases, and the impact of explainability on real-world behavior remains poorly understood. The author advocates for future interdisciplinary collaborations that integrate ethical reflection, user feedback, and technical development.
FIRST PUBLISHED IN: Devdiscourse

