Devesh Narayanan (National University of Singapore) and Zhi Ming Tan (Cornell) have posted “Attitudinal Tensions in the Joint Pursuit of Explainable and Trusted AI” on SSRN. Here is the abstract:
It is frequently demanded that AI-based Decision Support Tools (AI-DSTs) ought to be both explainable to, and trusted by, those who use them. The joint pursuit of these two principles is ordinarily believed to be uncontroversial. In fact, a common view is that AI systems should be made explainable so that they can be trusted, and in turn, accepted by decision-makers. However, the moral scope of these two principles extends far beyond this particular instrumental connection. This paper argues that if we were to account for the rich and diverse moral reasons that ground the call for explainable AI, and fully consider what it means to “trust” AI in a full-blooded sense of the term, we would uncover a deep and persistent tension between the two principles. For explainable AI to usefully serve the pursuit of normatively desirable goals, decision-makers must carefully monitor and critically reflect on the content of an AI-DST’s explanation. This entails a deliberative attitude. Conversely, the call for full-blooded trust in AI-DSTs implies the disposition to put questions about their reliability out of mind. This entails an unquestioning attitude. As such, the joint pursuit of explainable and trusted AI calls on decision-makers to simultaneously adopt incompatible attitudes towards their AI-DST, which leads to an intractable implementation gap. We analyze this gap and explore its broader implications: suggesting that we may need alternate theoretical conceptualizations of what explainability and trust entail, and/or alternate decision-making arrangements that separate the requirements for trust and deliberation to different parties.