Artificial Intelligence-Driven Explainable Machine Learning for High-Stakes Decision Support: SHAP Interpretability and Robustness Testing in Healthcare and Finance
DOI: https://doi.org/10.63125/mv67zd78

Keywords: SHAP Explainability, Interpretability, Explanation Robustness, Trust in AI, Decision Confidence

Abstract
This study addressed a persistent challenge in cloud-deployed enterprise decision-support systems: even when machine-learning models exhibit strong predictive performance, users may under-rely, or rely inconsistently, on recommendations because post hoc explanations are perceived as unclear or unstable, constraining trust, defensibility, and confident action. The purpose of the study was to examine how perceptions of explanation quality, specifically Perceived SHAP Interpretability (PSI) and Perceived Explanation Robustness (PER), are associated with Trust in AI decision support (TRU), Decision Confidence (DCF), and Intention to Rely or Use (INT) in high-stakes, case-based decision scenarios. A quantitative, cross-sectional, case-based design was employed across two enterprise contexts: healthcare clinical risk decision support and financial risk governance decision support. Standardized SHAP explanation artifacts were presented alongside decision vignettes, and a 5-point Likert-scale instrument measured PSI, PER, TRU, DCF, and INT, with controls for professional experience, AI familiarity, and domain group. The final sample comprised N = 240 screened respondents (52.1% healthcare; 47.9% finance), with mean professional experience of 7.8 years (SD = 4.6) and moderate AI familiarity (M = 3.62, SD = 0.84). The analytic approach combined descriptive statistics, reliability analysis, Pearson correlations, and a series of hierarchical multiple regression models. Mediation was examined using regression-based logic and sequential model estimation, assessing changes in direct effects when trust and confidence variables were introduced, rather than through causal path modeling or bootstrapped indirect effects. Reliability across constructs was strong (PSI α = .88; PER α = .86; TRU α = .90; DCF α = .87; INT α = .85), and mean scores exceeded the scale midpoint for all constructs (PSI M = 3.88; PER M = 3.61; TRU M = 3.74; DCF M = 3.69; INT M = 3.58). Correlation and regression results indicated that PSI and PER were positively associated with trust (PSI β = .41, p < .001; PER β = .34, p < .001; R² = .52). Trust accounted for substantial variance in decision confidence (β = .49, p < .001; DCF model R² = .57), and decision confidence accounted for substantial variance in intention to rely on AI recommendations (β = .43, p < .001; INT model R² = .49). When trust and confidence were included in the intention model, the direct associations of PSI and PER with intention were no longer statistically significant, a pattern consistent with indirect relationships operating through trust and confidence. Overall, the findings suggest that SHAP interpretability and explanation robustness are associated with trust formation and confidence calibration, which together account for meaningful variance in reliance intentions in high-stakes healthcare and finance decision-support contexts.
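As a concrete illustration of the sequential-model mediation logic summarized above, the sketch below shows how the trust, confidence, and intention models could be estimated step by step in Python with pandas and statsmodels. It is a minimal sketch, not the authors' analysis code; the file name and column names (psi, per, tru, dcf, int_use, exp_years, ai_fam, domain) are hypothetical placeholders for the study's composite construct scores and controls.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical respondent-level data: one row per screened participant,
    # with composite Likert scores for each construct and the stated controls.
    df = pd.read_csv("survey_responses.csv")

    # Step 1: explanation-quality perceptions predicting trust (TRU model).
    m_tru = smf.ols("tru ~ psi + per + exp_years + ai_fam + C(domain)", data=df).fit()

    # Step 2: trust added when predicting decision confidence (DCF model).
    m_dcf = smf.ols("dcf ~ psi + per + tru + exp_years + ai_fam + C(domain)", data=df).fit()

    # Step 3a: intention regressed on PSI and PER only (direct associations).
    m_int_direct = smf.ols("int_use ~ psi + per + exp_years + ai_fam + C(domain)", data=df).fit()

    # Step 3b: trust and confidence added; the regression-based mediation logic
    # compares the PSI/PER coefficients here against those in Step 3a.
    m_int_full = smf.ols(
        "int_use ~ psi + per + tru + dcf + exp_years + ai_fam + C(domain)", data=df
    ).fit()

    for label, model in [("TRU", m_tru), ("DCF", m_dcf),
                         ("INT (direct)", m_int_direct), ("INT (full)", m_int_full)]:
        print(label, "R^2 =", round(model.rsquared, 3))
        print(model.params.filter(["psi", "per", "tru", "dcf"]).round(3))

Under this sequential approach, an attenuation of the psi and per coefficients toward non-significance between Step 3a and Step 3b is the pattern the abstract reports as evidence of indirect relationships operating through trust and confidence.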
