<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Reliance and Trust on XAI Today</title>
    <link>https://xai.today/categories/reliance-and-trust/</link>
    <description>Recent content in Reliance and Trust on XAI Today</description>
    <generator>Hugo</generator>
    <language>en-US</language>
    <lastBuildDate>Fri, 01 Nov 2024 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://xai.today/categories/reliance-and-trust/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Evaluating the Influences of Explanation Style on Human-AI Reliance</title>
      <link>https://xai.today/posts/evaluating-influence-explanation-style/</link>
      <pubDate>Fri, 01 Nov 2024 00:00:00 +0000</pubDate>
      <guid>https://xai.today/posts/evaluating-influence-explanation-style/</guid>
      <description>&lt;p&gt;The recent paper &lt;a href=&#34;https://arxiv.org/abs/2410.20067&#34;&gt;&amp;ldquo;Evaluating the Influences of Explanation Style on Human-AI Reliance&amp;rdquo;&lt;/a&gt; investigates how different types of explanations affect human reliance on AI systems. The research focused on three explanation styles: feature-based, example-based, and a combined approach, with each style hypothesized to influence human-AI reliance in unique ways. A two-part experiment with 274 participants explored how these explanation styles impact reliance and interpretability in a human-AI collaboration setting, specifically a bird identification task. The study sought to address mixed evidence from previous literature on whether certain explanation styles reduce over-reliance on AI or improve human decision-making accuracy.&lt;/p&gt;&#xA;&lt;p&gt;To study human responses to various AI explanations, the researchers used a quantitative methodology, measuring reliance through initial and final decision accuracy shifts. The study employed the Judge-Advisor System (JAS) model to capture differences in human reliance before and after AI assistance. Key measures included the Appropriateness of Reliance (AoR) framework, developed by &lt;a href=&#34;https://dl.acm.org/doi/10.1145/3581641.3584066&#34;&gt;Schemmer et al.&lt;/a&gt;, which introduced two metrics: Relative AI Reliance (RAIR) and Relative Self-Reliance (RSR). These metrics quantified reliance by assessing how often humans appropriately switched to AI-supported decisions or maintained their initial, correct judgments. The researchers noted individual participant performance variations, revealing that higher-performing individuals demonstrated different reliance patterns compared to lower-performing ones, particularly when interacting with high-complexity tasks.&lt;/p&gt;&#xA;&lt;p&gt;The quantitative approach included metrics from the AoR model and the JAS framework. 
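As a rough sketch of how these reliance metrics work (my own illustration, not code from the paper), RAIR and RSR can be computed from per-trial records of a participant's initial answer, the AI advice, the final answer, and the ground truth:

```python
# Illustrative sketch, not the paper's code: RAIR is the fraction of trials
# where the human was initially wrong and the AI right in which the human
# switched to the AI; RSR is the fraction of trials where the human was
# initially right and the AI wrong in which the human kept their own answer.
def reliance_metrics(trials):
    """Each trial is a dict with 'initial', 'final', 'ai', 'truth' labels."""
    rair_num = rair_den = rsr_num = rsr_den = 0
    for t in trials:
        human_right = t["initial"] == t["truth"]
        ai_right = t["ai"] == t["truth"]
        if not human_right and ai_right:
            rair_den += 1                  # human should switch to the AI
            if t["final"] == t["ai"]:
                rair_num += 1
        elif human_right and not ai_right:
            rsr_den += 1                   # human should keep their answer
            if t["final"] == t["initial"]:
                rsr_num += 1
    rair = rair_num / rair_den if rair_den else None
    rsr = rsr_num / rsr_den if rsr_den else None
    return rair, rsr
```

Splitting such records by participant performance band, as the study does, then simply means computing these ratios per subgroup.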
Both RAIR and RSR metrics from AoR provided a structured comparison across explanation styles by evaluating the effect of explanations on reliance in human-AI interactions. While the reliance metrics were based on existing literature, this study extended their use by separating participants based on individual performance, creating a novel analysis approach. Additionally, accuracy shift measures captured how reliance on AI suggestions varied with task complexity and participant ability. This nuanced view highlighted reliance discrepancies based on cognitive engagement, suggesting that explanation styles should be tailored to user expertise and task requirements.&lt;/p&gt;&#xA;&lt;p&gt;The paper emphasizes the importance of Explainable AI (XAI) for human-in-the-loop tasks, where humans need to understand and trust AI recommendations effectively. Such explanations can calibrate user trust, ideally preventing over-reliance on incorrect AI outputs. XAI&amp;rsquo;s value is underscored in collaborative tasks, but this study’s focus on bird classification, although useful for understanding complex identification tasks, may not directly apply to more general, real-world applications. The research reveals challenges in establishing broad applicability for explanation methods due to the inherent limitations in specific experimental tasks that may not fully capture the varied decision-making contexts encountered in real-life scenarios.&lt;/p&gt;&#xA;&lt;p&gt;Despite these limitations, the study makes a significant contribution to the ongoing discourse on XAI by providing evidence that certain explanation forms (example-based, feature-based, or combined) affect reliance differently. This research highlights that example-based explanations, though beneficial for identifying incorrect AI suggestions, can also foster over-reliance, particularly when high-quality explanations reinforce trust. 
The paper suggests that balancing clarity and trust calibration remains an open question, crucial for advancing reliable human-AI collaboration frameworks.&lt;/p&gt;&#xA;&lt;p&gt;In summary, this work advances the conversation on XAI by demonstrating that different explanation styles yield complex, context-dependent effects on human reliance. Although the field has matured in recent years, debates persist about the ideal form and substance of explanations. This study contributes to understanding the nuanced roles of example- and feature-based explanations, highlighting that while explanations enhance interpretability, they do not uniformly improve reliance, particularly across varied user expertise levels and decision contexts. This reinforces the need for adaptable XAI methods that align with diverse human-AI collaboration needs.&lt;/p&gt;&#xA;</description>
    </item>
    <item>
      <title>How Subsets of the Training Data Affect a Prediction</title>
      <link>https://xai.today/posts/training-subsets-affect-prediction/</link>
      <pubDate>Sun, 20 Dec 2020 00:00:00 +0000</pubDate>
      <guid>https://xai.today/posts/training-subsets-affect-prediction/</guid>
      <description>&lt;p&gt;I was quite excited by the title of a new paper, released as a preprint this month. &lt;a href=&#34;https://www.academia.edu/84191713/Explainable_Artificial_Intelligence_How_Subsets_of_the_Training_Data_Affect_a_Prediction&#34;&gt;&amp;ldquo;Explainable Artificial Intelligence: How Subsets of the Training Data Affect a Prediction&amp;rdquo;&lt;/a&gt; by Andreas Brandsæter and Ingrid K. Glad appeared at first glance to align closely with my own work &lt;a href=&#34;https://link.springer.com/article/10.1007/s10462-020-09833-6&#34;&gt;CHIRPS: Explaining random forest classification&lt;/a&gt;, published earlier this year in June. It&amp;rsquo;s generally highly desirable to connect with other researchers with whom you share common ground, working contemporaneously. Often, fruitful collaborations are born.&lt;/p&gt;&#xA;&lt;p&gt;As it turns out, the authors have taken a fairly different approach to mine. The CHIRPS method uses a minimal number of constraints to discover a large, high-precision subset of neighbours in the training data that share the same classification from the model, and returns robust statistics that proxy for precision and coverage. Brandsæter and Glad&amp;rsquo;s method is a novel approach that works with regression and time series problems, and pre-supposes that there are subsets in the data (that may or may not be adjacent) that can be set up &lt;em&gt;in advance&lt;/em&gt; to reveal regions of influence on the final prediction of a given data point. We share a recognition of the importance of interpretability in AI and machine learning, especially in critical applications.&lt;/p&gt;&#xA;&lt;p&gt;The authors propose a methodology that uses Shapley values to measure the importance of different training data subsets in shaping model predictions. 
Shapley values, originating from coalitional game theory, are adapted here to quantify the contribution of each subset of training data as if each subset were a “player” influencing the outcome of the model&amp;rsquo;s prediction. This approach offers a fresh perspective by directly associating predictions with specific training data subsets, which can reveal patterns or biases that feature-based explanations might miss.&lt;/p&gt;&#xA;&lt;p&gt;The paper delves into the theoretical framework of Shapley values in a coalitional game context and extends this to analyze subset importance. The authors describe how their methodology can pinpoint the impact of specific subsets on predictions, facilitating insights into model behavior, training data errors, and potential biases. By using subsets rather than individual data points or features, this approach is particularly well-suited to models that rely on large, high-dimensional datasets where feature importance alone may not fully capture influential patterns. This method is demonstrated to be useful in understanding how similar predictions may stem from different subsets of data, emphasizing the complex interactions within training data that influence predictions.&lt;/p&gt;&#xA;&lt;p&gt;Through several case studies, the paper demonstrates how Shapley values for subset importance can be applied in real-world scenarios. For example, in time series data and autonomous vehicle predictions, subsets of training data based on chronological segmentation reveal how specific periods contribute to model outputs. This approach is shown to be valuable for identifying anomalies or segment-specific patterns that could affect model accuracy or introduce biases. 
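To make the game-theoretic idea concrete, here is a toy sketch (my own illustration, with a stand-in mean-predictor model, not the authors' implementation) of exact Shapley values where each training-data subset is a player and the payoff of a coalition is the prediction of a model fitted only on that coalition's data:

```python
# Illustrative sketch: exact Shapley values over training-data subsets.
# The "model" here is a stand-in that predicts the mean of its training rows;
# in practice one would retrain the real model on each coalition.
from itertools import combinations
from math import factorial

def predict(coalition, subsets, baseline=0.0):
    rows = [x for name in coalition for x in subsets[name]]
    return sum(rows) / len(rows) if rows else baseline

def shapley(subsets):
    players = sorted(subsets)
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for coal in combinations(others, k):
                # Standard Shapley coalition weight
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (predict(coal + (p,), subsets) - predict(coal, subsets))
        phi[p] = total
    return phi
```

The subset attributions sum to the prediction minus the empty-coalition baseline, which is what makes the decomposition interpretable; the cost is the exponential number of coalitions, hence the authors' discussion of computational complexity.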
Additionally, by explaining the squared error for predictions, the authors illustrate how this methodology can also diagnose errors in training data, which could improve overall model reliability.&lt;/p&gt;&#xA;&lt;p&gt;The authors discuss limitations and challenges, particularly around the computational complexity of retraining models on multiple subsets to calculate Shapley values. They suggest that, while computationally intensive, this process can be optimized with parallel processing and may not need to be repeated for each new test instance. They also propose potential applications of this methodology in tailoring training data acquisition strategies, such as for cases where predictions are most critical, which can improve model performance by selectively sampling from influential subsets.&lt;/p&gt;&#xA;&lt;p&gt;In conclusion, Brandsæter and Glad’s paper represents a significant advancement in explainable AI by emphasizing the training data’s impact on model predictions. By shifting focus to data-centric explanations, their approach highlights how subsets within the data contribute directly to individual predictions, expanding the interpretative toolkit beyond traditional feature importance. This approach aligns with my own work on CHIRPS, underscoring the notion that providing contextual information from training data strengthens model transparency and interpretability. Using training data as a reference framework enables explainable AI methods to draw on established statistical theory, which ultimately lends robustness to explanations, even in black-box models. Together, these methods suggest a promising direction for explainable AI, wherein training data subsets serve as crucial elements to understand and elucidate model behavior effectively.&lt;/p&gt;&#xA;</description>
    </item>
    <item>
      <title>Faithful and Customizable Explanations of Black Box Models</title>
      <link>https://xai.today/posts/faithful-customizable-explanations/</link>
      <pubDate>Sun, 05 Jan 2020 00:00:00 +0000</pubDate>
      <guid>https://xai.today/posts/faithful-customizable-explanations/</guid>
      <description>&lt;p&gt;The authors of &amp;ldquo;&lt;a href=&#34;https://dl.acm.org/doi/10.1145/3306618.3314229&#34;&gt;Faithful and Customizable Explanations of Black Box Models&lt;/a&gt;&amp;rdquo; (MUSE) share a common goal with my own research: addressing the challenge of making machine learning models interpretable. Both emphasize the importance of transparency in decision-making, particularly in scenarios where human trust and understanding are critical, such as healthcare, judicial decisions, and financial assessments. Both they and I see decision rule structures as the ideal format for explaining model behaviour.&lt;/p&gt;&#xA;&lt;p&gt;MUSE uses a two-level decision set framework, which combines subspace descriptors and decision logic to generate explanations for different regions of the feature space. This is useful for zooming in on specific features and observation subsets of interest. Just like my own research, this is a highly user-centric approach, emphasising a Human-in-the-Loop process of expert review of model decisions. My method differs in that it facilitates a detailed review of individual decisions, potentially allowing the expert user to respond to individuals seeking some form of review or redress over an automated decision. In essence, this is a response to the “computer says no” problem. The explanations are tailored to specific needs or contexts.&lt;/p&gt;&#xA;&lt;p&gt;This focus on end-user interaction reflects a broader effort in both frameworks to build trust in machine learning outputs by providing meaningful insights. Despite these similarities, the research ideas diverge in significant ways. MUSE has a broader scope, offering global explanations as well as targeted insights into specific subspaces of the model&amp;rsquo;s behaviour. It is designed to be model-agnostic, meaning it can work with any type of predictive system. 
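As a loose illustration of the frequent-pattern idea behind my approach (a simplified sketch, not the published CHIRPS implementation), conditions that co-occur across the decision paths of ensemble trees voting for the target class can be mined with a simple support count:

```python
# Illustrative sketch: mine frequently co-occurring path conditions from an
# ensemble's decision paths. Each path is the list of split conditions met
# on the way to a leaf in one tree that voted for the target class.
from collections import Counter
from itertools import combinations

def frequent_condition_sets(paths, size=2, min_support=0.5):
    """Return condition sets of the given size that appear in at least
    min_support (a fraction) of the supplied decision paths."""
    counts = Counter()
    for path in paths:
        for combo in combinations(sorted(set(path)), size):
            counts[combo] += 1
    threshold = min_support * len(paths)
    return [c for c, n in counts.items() if n >= threshold]
```

The surviving condition sets are candidate rule antecedents; ranking them by how precisely they isolate same-class neighbours in the training data gives the local, high-precision flavour described above.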
My research has a specific focus on Decision Tree ensembles (Random Forest and Boosting methods), explaining how such a classifier reached a decision for a particular data point, emphasising precision and counterfactual reasoning.&lt;/p&gt;&#xA;&lt;p&gt;The methodologies also differ. MUSE employs optimization techniques to create compact and interpretable decision sets that balance fidelity, unambiguity, and interpretability. My approach, in contrast, extracts decision paths from random forests using frequent pattern mining, constructing rules that highlight the most influential attributes in a model&amp;rsquo;s classification. These distinct methods reflect their differing objectives: MUSE aims to provide a comprehensive view of a model&amp;rsquo;s behaviour, while I seek to zero in on individual classifications with a high degree of local accuracy.&lt;/p&gt;&#xA;&lt;p&gt;Together, these research approaches represent two sides of the same coin: one offering a high-level overview and the other delivering precise, localised explanations. There is a lot of scope for combining the two methods in a collaborative framework for holistic explanations.&lt;/p&gt;&#xA;</description>
    </item>
  </channel>
</rss>
