Overcoming bias to build resilient models: A path to reliable medical AI

Artificial Intelligence (AI) is rapidly transforming healthcare, from diagnosing diseases to predicting outcomes. However, the application of deep neural networks (DNNs) in high-stakes medical settings comes with a significant challenge: their susceptibility to learning spurious correlations in data, a phenomenon known as shortcut learning. This issue can lead to unsafe and unreliable predictions, jeopardizing patient outcomes.

Addressing this critical concern, researchers Frederik Pahde, Thomas Wiegand, Sebastian Lapuschkin, and Wojciech Samek of the Fraunhofer Heinrich Hertz Institute (HHI) propose, in their study "Ensuring Medical AI Safety: Explainable AI-Driven Detection and Mitigation of Spurious Model Behavior," a novel framework that leverages Explainable AI (XAI) to detect and mitigate these biases.

The problem: Spurious behavior in medical AI

DNNs, while highly effective in tasks such as melanoma detection and cardiovascular disease prediction, often rely on irrelevant features correlated with target labels due to biases in training data. For instance, a model might associate the presence of a band-aid in dermoscopic images with benign lesions, leading to potentially harmful misdiagnoses. Similarly, AI systems trained on radiographs may base predictions on metadata artifacts like hospital identifiers rather than medical features, as seen in pneumonia detection systems.

These biases stem from data artifacts - unintended correlations introduced during dataset curation. Examples include color shifts from medical imaging devices, skin markers, or even patient demographics such as age or ethnicity. These hidden correlations underscore the pressing need for frameworks that can identify and address them.

The solution: An XAI-driven framework

The study introduces an enhanced version of the Reveal2Revise framework, designed to identify and mitigate spurious correlations in medical AI systems. The proposed framework integrates explainable AI techniques for both data-level and model-level bias detection, enabling semi-automated annotation and refinement processes.

Key features of the framework

  • Bias Detection and Representation: The framework identifies biases by analyzing outlier behaviors in both input data and model predictions. It uses Concept Activation Vectors (CAVs) to represent biases as directions in latent space, enabling the detection of spurious correlations without requiring extensive manual labeling (a minimal sketch of how such a direction can be estimated appears after this list).

  • Semi-Automated Annotation: Bias localization at both sample and pixel levels is achieved using local attribution methods like Layer-wise Relevance Propagation (LRP). These methods create heatmaps that highlight areas influenced by biases, reducing the need for labor-intensive manual annotations.

  • Iterative Refinement: The framework employs an iterative process to improve bias representations. By refining initial CAV-based annotations, the system minimizes labeling errors and enhances the quality of identified bias patterns.

  • Bias Mitigation: Using insights from detected biases, the framework applies correction techniques such as Right for the Right Reasons (RRR) loss functions and post-hoc model editing to unlearn undesired shortcuts while preserving valid features (an RRR-style loss is sketched after this list). These corrections significantly improve model robustness and generalization.
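
To make the bias-representation step concrete, here is a minimal sketch of how a CAV could be estimated from the activations of a trained model. The variable names, the choice of a logistic-regression probe, and the use of scikit-learn are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: estimating a Concept Activation Vector (CAV) for an artifact
# concept (e.g. "band-aid present") from pre-extracted latent activations.
# `acts_artifact` and `acts_clean` are assumed to be arrays of shape
# (n_samples, n_features) taken from one layer of the trained model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(acts_artifact: np.ndarray, acts_clean: np.ndarray) -> np.ndarray:
    """Fit a linear probe separating artifact from clean activations; the
    normal of its decision boundary serves as the CAV direction."""
    X = np.concatenate([acts_artifact, acts_clean], axis=0)
    y = np.concatenate([np.ones(len(acts_artifact)), np.zeros(len(acts_clean))])
    probe = LogisticRegression(max_iter=1000).fit(X, y)
    cav = probe.coef_.ravel()
    return cav / np.linalg.norm(cav)  # unit-length direction in latent space

# Samples whose activations project strongly onto the CAV can then be
# flagged as candidate artifact samples for semi-automated annotation:
# scores = all_activations @ compute_cav(acts_artifact, acts_clean)
```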

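A second minimal sketch illustrates the mitigation idea behind an RRR-style loss: the standard classification loss is augmented with a penalty on input gradients that fall inside an annotated artifact mask. The PyTorch formulation, variable names, and weighting factor are illustrative assumptions rather than the paper's exact training objective.

```python
# Sketch: Right-for-the-Right-Reasons (RRR) style penalty that discourages
# the model from relying on pixels inside an annotated artifact mask.
import torch
import torch.nn.functional as F

def rrr_loss(model, x, y, artifact_mask, lam=10.0):
    """x: (B, C, H, W) images, y: (B,) labels,
    artifact_mask: (B, 1, H, W) binary mask marking spurious regions."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)               # usual "right answer" term

    # Gradient of the summed log-probabilities with respect to the input.
    log_probs = F.log_softmax(logits, dim=1).sum()
    input_grad = torch.autograd.grad(log_probs, x, create_graph=True)[0]

    # Penalize only gradient mass inside the artifact region, so valid
    # image evidence outside the mask is left untouched.
    penalty = (artifact_mask * input_grad).pow(2).sum() / x.shape[0]
    return ce + lam * penalty                     # "right reasons" term added
```
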
Applications and results

The researchers tested their framework across four medical datasets - ISIC2019 for melanoma detection, HyperKvasir for gastrointestinal abnormalities, CheXpert for chest radiographs, and PTB-XL for ECG data. They demonstrated its effectiveness in identifying and mitigating biases in models like VGG16, ResNet50, and Vision Transformers.

  • Bias Detection: The framework accurately detected spurious correlations caused by both real-world and artificially introduced artifacts, such as band-aids, timestamps, and pacemakers.
  • Bias Mitigation: Models corrected using the framework achieved significant improvements in robustness. For example, in the ISIC2019 dataset, the corrected model reduced reliance on artifacts while maintaining high accuracy on clean data.
  • Efficiency: The semi-automated annotation process minimized manual efforts, making the framework scalable for large datasets.

Implications for medical AI

This study emphasizes the critical role of explainability in medical AI systems, particularly in high-stakes scenarios where errors can have severe consequences. Explainable AI (XAI) provides transparency into the decision-making processes of deep learning models, addressing the longstanding challenge of their "black-box" nature. By incorporating XAI methods, this framework enables clinicians, researchers, and developers to identify and understand spurious model behaviors, such as reliance on non-causal features like data artifacts or demographic confounders. This understanding not only enhances trust in AI systems but also ensures that these models perform ethically and reliably in real-world medical applications.

The framework’s ability to detect and mitigate biases has a transformative impact on healthcare delivery. For example, in tasks like melanoma detection, ensuring that models focus on clinically relevant features rather than superficial cues (e.g., the presence of band-aids) can significantly improve diagnostic accuracy and patient safety. Similarly, in radiology, where models have been shown to learn shortcuts such as hospital-specific metadata, the proposed framework ensures that predictions are based on genuine pathological features. This leads to more equitable and generalizable models, particularly important in settings with diverse patient populations and varying data sources.

Moreover, the implications of this research extend far beyond the healthcare domain. Many AI-driven systems in other critical sectors, such as autonomous driving, aviation, and financial risk management, face similar challenges related to spurious correlations and opaque decision-making processes. By leveraging XAI techniques, these systems can be made more interpretable, robust, and fair. For instance, autonomous vehicles can use XAI-based methods to ensure their decision-making prioritizes safety-relevant features, like detecting pedestrians, over potentially misleading environmental factors. In finance, XAI can help identify and mitigate biases in models predicting creditworthiness, ensuring compliance with ethical and legal standards.

Future directions

While this framework represents a significant leap forward, the study identifies several areas for further exploration and refinement. One critical avenue is the development of disentangled concept spaces for bias representation. Current methods may encode multiple overlapping features within the same latent space direction, making it challenging to isolate specific biases. By creating disentangled concept spaces, researchers can represent biases more accurately, leading to improved detection and mitigation without affecting valid features.

Another promising direction involves the integration of domain-specific knowledge into the framework. In medical AI, leveraging expert annotations and clinical guidelines can enhance the detection of subtle, context-specific biases. For instance, models designed for skin cancer detection could incorporate dermatologist insights to distinguish between clinically meaningful patterns and irrelevant visual artifacts. Similarly, in cardiology applications, domain expertise could guide the model’s focus on pathophysiologically significant features in ECG data.

Additionally, there is scope for improving post-hoc model editing techniques to minimize collateral damage. While existing approaches, such as Class Artifact Compensation (ClArC) and RRR, effectively suppress biases, they may inadvertently degrade model performance on unrelated tasks or on features entangled with the bias. Refining these methods to preserve valid model behaviors while removing only the spurious ones is essential for maintaining high levels of accuracy and generalizability. Future research could explore adaptive loss functions or targeted fine-tuning methods to achieve this balance.
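
To illustrate what projection-based editing can look like in practice, the sketch below removes the component of a latent feature vector that lies along an artifact CAV before the features reach the classification head. This is a simplified illustration in the spirit of ClArC, assuming a unit-length cav obtained as in the earlier sketch; it is not the authors' exact procedure, and it also shows where collateral damage can arise, since any valid feature correlated with the CAV direction is suppressed along with the artifact.

```python
# Sketch: projection-based model editing that removes the artifact (CAV)
# direction from latent features before classification.
import torch

class CAVProjection(torch.nn.Module):
    """Returns h minus its component along a fixed concept direction."""
    def __init__(self, cav: torch.Tensor):
        super().__init__()
        self.register_buffer("cav", cav / cav.norm())

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (B, D) latent features; subtract the projection onto the CAV.
        coeff = (h * self.cav).sum(dim=1, keepdim=True)
        return h - coeff * self.cav

# Usage sketch: insert between a backbone and its classification head.
#   features = backbone(x)                   # (B, D)
#   cleaned  = CAVProjection(cav)(features)  # artifact direction removed
#   logits   = head(cleaned)
```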

Finally, as AI systems become increasingly complex, incorporating explainability at every stage of development - from data curation to model deployment - will be vital. Future efforts could focus on building frameworks that proactively identify and address biases during training rather than relying on reactive post-hoc corrections.
