The integration of artificial intelligence (AI) into healthcare systems has revolutionized clinical analytics, enabling enhanced diagnostic accuracy, predictive modeling, and personalized treatment pathways. However, the opacity of many AI models poses significant challenges to their clinical adoption, necessitating advancements in explainable AI (XAI) to ensure interpretability and transparency. This narrative review synthesizes the literature on XAI within clinical systems, focusing on interpretability mechanisms, transparency frameworks, and deployment constraints in healthcare analytics. Drawing from high-impact studies, we examine how XAI addresses the “black box” nature of machine learning models in high-stakes medical decisions, particularly in contexts where performance has traditionally been prioritized over explainability. Key themes include the shift toward inherently interpretable models for critical applications, such as diagnostic imaging and predictive analytics, where post-hoc explanations often fall short. We explore the ethical imperatives for responsible AI deployment, including strategies for mitigating harm through transparent systems that align with clinical workflows. The review integrates perspectives on XAI in clinical diagnostics, emphasizing challenges in balancing model complexity with user trust. Transparency is framed not merely as a technical feature but as a systemic requirement, incorporating structured reporting practices for AI interventions and standardized modeling approaches. Deployment constraints are analyzed through the lens of real-world integration, including regulatory considerations, data privacy concerns, and human–AI interaction dynamics in healthcare infrastructures. We synthesize evidence from diverse applications, such as lung cancer diagnosis via explainable models and radiographic assessments, underscoring the need for multidisciplinary approaches to XAI. 
Furthermore, the review highlights biases in AI systems, particularly sex and gender disparities, and advocates for inclusive analytics to foster equitable healthcare. We also discuss clinical applications that move beyond black-box models and echo calls for standardized reporting to enhance reproducibility and trust. We position XAI as essential for closed-loop systems that incorporate feedback mechanisms, ensuring ongoing model recalibration in dynamic clinical environments. The synthesis reveals persistent gaps in current XAI deployments, such as overreliance on surrogate explanations that may mislead clinicians. Ultimately, this review proposes a systems-level framework for XAI in healthcare that integrates data ingestion, inference, decision support, and governance loops to overcome transparency barriers. This overview is intended to inform the development of future AI-enabled healthcare infrastructures, with interpretability as a cornerstone of safe and effective clinical analytics.
The integration of artificial intelligence into clinical decision support systems offers improved diagnostic accuracy and efficiency, but the opacity of many machine learning models raises concerns about trust, accountability, and regulatory compliance. Explainable artificial intelligence (XAI) has been proposed to address this by making model predictions interpretable to clinicians; however, its true clinical value remains uncertain, and evaluation has not kept pace with methodological development. This systematic review aimed to identify XAI methods used in clinical decision support systems, assess how they are evaluated with clinicians, and determine whether explanations improve diagnostic accuracy, trust, mental models, and efficiency. Following PRISMA guidelines, we searched PubMed, Web of Science, IEEE Xplore, ACM Digital Library, and Scopus for studies published between 2017 and 2024. Eligible studies included original research evaluating XAI in clinical decision support systems with clinician participants and reporting quantitative or qualitative outcomes. Risk of bias was assessed using adapted QUADAS-2 and ROBIS tools, and findings were synthesized narratively with subgroup analyses. From 2,847 records, 68 studies were included. The most common XAI methods were SHAP-based feature attribution (38%), saliency or heatmap methods (29%), concept-based approaches such as TCAV (15%), and counterfactual or example-based explanations (12%). Radiology was the dominant field (54%), followed by dermatology (18%) and pathology (12%). Evaluation approaches were highly inconsistent, with few validated instruments and most studies relying on Likert-scale trust measures or qualitative feedback. Only 16% of studies showed improved diagnostic accuracy with explanations, 67% showed no significant effect, and 17% reported reduced accuracy due to over-reliance or misinterpretation. 
Although 82% of studies reported increased clinician trust, trust rarely correlated with actual diagnostic performance. Overall, while XAI methods are widely studied in clinical decision support, their evaluation is inconsistent and their benefits are limited. Explanations tend to increase clinician trust without reliably improving diagnostic accuracy, and may sometimes worsen performance, highlighting a trust–accuracy gap that poses important safety concerns for clinical deployment.
Postpartum hemorrhage (PPH) is the leading cause of maternal mortality worldwide, accounting for 25–30% of maternal deaths, particularly in low-resource settings. Early identification of high-risk patients during labor could enable timely interventions such as uterotonic administration, blood preparation, and escalation of care. However, current risk stratification models rely mainly on static antepartum factors and fail to incorporate dynamic intrapartum physiological changes. Existing tools, including those from the California Maternal Quality Care Collaborative, use baseline maternal characteristics such as prior PPH, BMI, parity, and comorbidities, but do not capture continuously evolving labor data. Intrapartum signals such as fetal heart rate patterns, maternal vital sign trends, and labor progression metrics contain rich predictive information that remains underused in real-time decision-making, and clinical judgment is limited by inter-observer variability and by the difficulty of integrating complex temporal trends. To address this gap, we propose an explainable gradient boosting machine framework for real-time PPH risk prediction. The framework integrates electronic fetal monitoring parameters (baseline rate, variability, decelerations), maternal vital signs (heart rate, blood pressure, temperature, oxygen saturation), and labor progression features (cervical dilation, contraction frequency, stage duration, and oxytocin use), and produces continuously updated risk scores throughout labor. The system combines a gradient boosting model (XGBoost or LightGBM), a SHAP-based explainability module, a real-time feature extraction pipeline, and a clinician-facing dashboard that displays risk scores and key contributing factors. SHAP provides both global and patient-specific interpretability by identifying how features such as tachysystole or prolonged labor stages influence predictions, thereby improving transparency and clinical trust. 
Overall, this framework enables dynamic, interpretable PPH risk assessment using routinely collected intrapartum data, combining predictive accuracy with explainability to support earlier detection of hemorrhage risk and more timely, targeted interventions.
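To make the patient-specific attribution idea concrete, the following is a minimal sketch of exact Shapley value computation by coalition enumeration, which is the quantity that SHAP-style methods approximate or compute efficiently for tree models. It does not use the SHAP library or a trained gradient boosting model: the scoring function `toy_risk_score`, the feature names, the baseline values, and all coefficients are hypothetical stand-ins chosen for illustration, not values from the proposed framework and not clinical guidance.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and reference ("baseline") values; assumptions only.
FEATURES = ["maternal_hr", "contractions_per_10min", "second_stage_min"]
BASELINE = {"maternal_hr": 80.0, "contractions_per_10min": 3.0, "second_stage_min": 30.0}


def toy_risk_score(x):
    """Hand-written stand-in for a trained model, with one interaction term."""
    score = 0.02 * (x["maternal_hr"] - 80.0)
    score += 0.10 * max(0.0, x["contractions_per_10min"] - 5.0)  # tachysystole-like term
    score += 0.005 * max(0.0, x["second_stage_min"] - 120.0)     # prolonged-stage term
    if x["maternal_hr"] > 100.0 and x["second_stage_min"] > 120.0:
        score += 0.20  # interaction between the two features
    return score


def shapley_values(model, x, baseline, features):
    """Exact Shapley values; absent features are set to their baseline value."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        contribution = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = {g: x[g] if g in subset else baseline[g] for g in features}
                with_f = dict(without_f)
                with_f[f] = x[f]
                contribution += weight * (model(with_f) - model(without_f))
        phi[f] = contribution
    return phi


# Hypothetical patient snapshot at one point during labor.
patient = {"maternal_hr": 110.0, "contractions_per_10min": 6.0, "second_stage_min": 150.0}
phi = shapley_values(toy_risk_score, patient, BASELINE, FEATURES)

# Rank features by absolute contribution, as a dashboard might display them.
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that lets a dashboard present per-feature contributions that add up to the displayed risk. Enumeration is exponential in the number of features, so a real system with many features would rely on a tree-specific algorithm rather than this brute-force form.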
Anticoagulation management requires balancing multiple factors such as bleeding risk, thromboembolic risk, drug interactions, and renal function. Deep learning can assist in risk prediction, but its usefulness depends on whether clinicians can understand and verify its recommendations. Black-box models may recommend actions without providing clear explanations, whereas clinical guidelines are rule-based but cannot be executed directly by neural models. This article introduces a neuro-symbolic XAI framework that combines deep learning predictions with explicit clinical guidelines. It comprises a neural prediction module, a symbolic reasoning engine, and an integration layer that produces traceable justifications. By connecting data-driven predictions to clinical rules, the neuro-symbolic approach improves the auditability and trustworthiness of decision support. The framework follows an explainability-by-design approach and aims to enhance anticoagulation management by providing verifiable, clinician-understandable decision support.
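The three-part structure described above (neural prediction module, symbolic reasoning engine, integration layer) can be sketched roughly as follows. This is an illustrative skeleton under stated assumptions, not the article's implementation: the logistic function stands in for a trained network, and the rule identifiers, rule wording, field names, thresholds, and coefficients are all hypothetical and must not be read as clinical guidance.

```python
from dataclasses import dataclass
from math import exp
from typing import Callable, Dict, List


@dataclass
class Rule:
    """One explicit, human-readable rule (hypothetical content)."""
    rule_id: str
    description: str
    applies: Callable[[Dict], bool]


@dataclass
class Decision:
    bleeding_risk: float
    fired_rules: List[str]
    recommendation: str
    trace: List[str]  # ordered, human-readable justification


def neural_bleeding_risk(patient: Dict) -> float:
    """Stand-in for the neural module: a fixed logistic score, not a trained model."""
    z = -3.0 + 0.03 * patient["age"] + 1.0 * patient["prior_bleed"]
    return 1.0 / (1.0 + exp(-z))


# Symbolic module: rules with invented thresholds, for illustration only.
RULES = [
    Rule("R1", "eGFR below 30: renal dosing needs review", lambda p: p["egfr"] < 30),
    Rule("R2", "interacting drug present: interaction check needed",
         lambda p: bool(p["on_interacting_drug"])),
]


def decide(patient: Dict, risk_threshold: float = 0.5) -> Decision:
    """Integration layer: combine the prediction with fired rules and record why."""
    risk = neural_bleeding_risk(patient)
    trace = [f"neural module: predicted bleeding risk {risk:.2f} "
             f"(threshold {risk_threshold:.2f})"]
    fired = [r for r in RULES if r.applies(patient)]
    for r in fired:
        trace.append(f"rule {r.rule_id} fired: {r.description}")
    if fired or risk >= risk_threshold:
        recommendation = "flag for clinician review"
    else:
        recommendation = "no flag raised"
    trace.append(f"integration layer: {recommendation}")
    return Decision(risk, [r.rule_id for r in fired], recommendation, trace)


# Hypothetical patient record.
patient = {"age": 80, "prior_bleed": 1, "egfr": 25, "on_interacting_drug": False}
decision = decide(patient)
for line in decision.trace:
    print(line)
```

Keeping the rules as named data rather than burying them in model weights is what makes the justification auditable: each line of the trace points either to the numeric prediction or to a specific rule identifier that a clinician can check against the guideline text.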