AI that says ‘I don’t know’ could make technology safer
Machine learning systems that choose not to answer when uncertain, a capability known as abstention, are being closely linked to a philosophical concept long studied in human reasoning: the suspension of judgment. A new study, "Abstaining Machine Learning: Philosophical Considerations," published in AI & Society, argues that these abstaining systems may be the closest current approximation of “cautious” or “neutral” artificial intelligence, offering both ethical benefits and novel technical challenges for AI deployment.
The paper, authored by Daniela Schuster at the University of Konstanz, proposes a cross-disciplinary framework for understanding abstaining machine learning (AML) by drawing analogies from philosophy. It examines how systems that opt out of classification under uncertainty exhibit behaviors similar to epistemic suspension: a mental state in which a person neither believes nor disbelieves a proposition because the evidence is insufficient.
The study distinguishes between different reasons for abstention, namely ambiguity and outlier scenarios. Ambiguity abstention occurs when an AI system receives conflicting data that makes multiple outputs equally likely, while outlier abstention is triggered when a system encounters unfamiliar input, such as data too dissimilar from anything it was trained on. These abstaining responses parallel, respectively, the two philosophical justifications for suspending belief: conflicting evidence and a lack of sufficient knowledge.
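To make the distinction concrete, the sketch below shows how a probabilistic classifier could trigger each kind of abstention. It is an illustration rather than the paper's method: the margin and distance thresholds, and the nearest-neighbour outlier test, are assumptions.

```python
# Illustrative sketch of the two abstention triggers described above.
# Thresholds and the nearest-neighbour outlier test are assumptions.
import numpy as np

AMBIGUITY_MARGIN = 0.10   # hypothetical: abstain if the top two classes are nearly tied
OUTLIER_DISTANCE = 5.0    # hypothetical: abstain if the input is far from all training data

def classify_or_abstain(class_probs, x, train_X):
    """Return a class index, or a string naming the reason for abstaining."""
    top_two = np.sort(class_probs)[-2:]
    if top_two[1] - top_two[0] < AMBIGUITY_MARGIN:
        return "abstain: ambiguity (conflicting evidence, classes nearly tied)"
    if np.min(np.linalg.norm(train_X - x, axis=1)) > OUTLIER_DISTANCE:
        return "abstain: outlier (input unlike anything seen in training)"
    return int(np.argmax(class_probs))
```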
Schuster further categorizes AI abstention methods into two core types: attached and merged systems. Attached abstention systems add a distinct component that pre-empts or overrides the prediction whenever confidence in the output falls too low. Merged systems, by contrast, integrate abstention directly into the decision-making architecture, treating it as a valid output option alongside conventional answers.
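A minimal sketch of the two architectures, assuming a generic probabilistic classifier, might look as follows; the function names, the confidence threshold, and the sentinel value are illustrative rather than taken from the paper.

```python
# Illustrative contrast between attached and merged abstention.
# All names and values here are assumptions for the sketch.
import numpy as np

ABSTAIN = -1  # sentinel meaning "no answer"

def attached_abstention(base_probs, threshold=0.8):
    """Attached: a separate component overrides the base model's prediction
    whenever its confidence falls below a preset threshold."""
    pred = int(np.argmax(base_probs))
    return pred if base_probs[pred] >= threshold else ABSTAIN

def merged_abstention(probs_with_abstain_class):
    """Merged: abstention is one of the model's own learned output options;
    here the last entry of the probability vector is the abstain class."""
    pred = int(np.argmax(probs_with_abstain_class))
    return ABSTAIN if pred == len(probs_with_abstain_class) - 1 else pred
```

In the attached case the threshold is imposed from outside the model; in the merged case the probability of the abstain class is something the model itself has learned, which is the point the next paragraphs develop.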
The distinction has important implications. According to the paper, only merged systems qualify as engaging in abstention that is conceptually similar to human suspension of judgment. In merged systems, abstaining is an internal, learned response to uncertainty about the task itself—not a reaction imposed after the fact. This alignment matters, the author argues, because it ties into long-standing philosophical debates about autonomy, explainability, and the nature of intelligent behavior.
In particular, merged systems show potential to meet two key criteria often used to assess artificial autonomy: the ability to learn flexibly rather than follow hard-coded rules, and the capacity to adapt behavior based on input. These systems determine on their own when to abstain based on patterns they observe during training, rather than relying on preset thresholds or external override mechanisms. This means that merged abstention systems can evolve their understanding of when it's appropriate to say "I don't know," mimicking one of the core features of intelligent human judgment.
On explainability, merged systems again outperform their attached counterparts. Because abstention is part of their learned behavior, these systems can trace abstaining outputs back to specific features in the input data. For example, using heatmaps or feature maps, a system might highlight the portion of an image that triggered uncertainty. This contrasts with attached systems, which often cannot explain abstention beyond citing a low-confidence score or generic uncertainty metric.
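As one illustration of how such an attribution could work, the occlusion-style sketch below (an assumption, not the paper's method) scores each input feature by how much blanking it out lowers the merged model's learned abstain probability; reshaped to an image's spatial layout, such scores give the kind of heatmap mentioned above.

```python
# Occlusion-style attribution for an abstention (illustrative sketch).
# `abstain_prob` is assumed to be a callable returning the probability the
# merged model assigns to its abstain class for a given NumPy feature vector.
import numpy as np

def abstention_saliency(abstain_prob, x):
    base = abstain_prob(x)
    saliency = np.zeros_like(x, dtype=float)
    for i in range(len(x)):
        occluded = x.copy()
        occluded[i] = 0.0                       # blank out one feature
        saliency[i] = base - abstain_prob(occluded)
    return saliency                             # high values: features driving the abstention
```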
The ethical importance of this capability is especially relevant in high-stakes domains. Medical diagnostics, financial screening, and autonomous driving systems are increasingly powered by machine learning models. In such contexts, Schuster argues, it is safer and more transparent for a system to abstain, flagging an unclear case for human review, than to issue a possibly flawed or overconfident judgment. This mirrors how human experts sometimes defer decisions, admitting uncertainty rather than risking a critical error.
The study also warns of potential pitfalls. In some implementations, particularly those involving labeled abstention, abstention becomes just another predefined output, no different from other labels like yes or no. This undermines its conceptual role as a fallback or meta-cognitive response, reducing abstention to a rigid classification rather than a nuanced recognition of uncertainty.
Furthermore, the paper notes that training data often lacks the diversity needed to teach meaningful abstention behaviors. For example, systems might not learn to abstain appropriately if they haven’t encountered ambiguous or outlier cases during training. This underscores the need for carefully curated datasets and loss functions that penalize incorrect outputs more heavily than abstention itself, thereby reinforcing conservative behavior when appropriate.
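One common way to encode that asymmetry, sketched below with hypothetical cost values, is a loss in which a wrong answer costs more than an abstention, so that saying "I don't know" becomes the cheaper option on genuinely hard cases.

```python
# Illustrative abstention-aware loss: wrong answers cost more than abstaining.
# The cost values and the ABSTAIN token are assumptions for this sketch.
ABSTAIN = "abstain"

COST_CORRECT = 0.0
COST_ABSTAIN = 0.3   # assumed: abstaining costs something, but less than an error
COST_WRONG   = 1.0

def abstention_aware_loss(prediction, label):
    if prediction == ABSTAIN:
        return COST_ABSTAIN
    return COST_CORRECT if prediction == label else COST_WRONG
```

Under these costs, a well-calibrated classifier minimizes its expected loss by abstaining whenever its top class probability falls below 1 - 0.3 = 0.7, which is exactly the kind of conservative behavior the paper describes.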
Philosophically, the paper makes a compelling case for treating abstaining machine learning as a meaningful domain of study. By aligning AI behaviors with human epistemic attitudes, the analysis opens new avenues for responsible AI design, especially in an era where automated systems increasingly affect daily life. Rather than building machines that always answer, the future may lie in systems that know when not to.
While Schuster’s research focuses primarily on classification systems in medicine and computer vision, the implications are broader. Any AI tasked with high-confidence decision-making, whether in law, finance, or governance, may benefit from an abstaining option. The paper encourages researchers to explore abstention not as a weakness but as a principled choice rooted in human-like reasoning.
The author concludes that abstaining ML systems, particularly those with merged architectures, should be further developed as part of a wider push toward explainable and trustworthy AI. She recommends that future research examine whether the capacity to abstain, even without explaining why, already contributes to perceptions of intelligence and transparency in artificial systems.
- FIRST PUBLISHED IN:
- Devdiscourse

