A Comparative Analysis of Interpretable Machine Learning Methods

arXiv: 2601.00428v1
Authors

Mattia Billa, Giovanni Orlandi, Veronica Guidetti, Federica Mandreoli

Abstract

In recent years, Machine Learning (ML) has seen widespread adoption across a broad range of sectors, including high-stakes domains such as healthcare, finance, and law. This growing reliance has raised increasing concerns regarding model interpretability and accountability, particularly as legal and regulatory frameworks place tighter constraints on using black-box models in critical applications. Although interpretable ML has attracted substantial attention, systematic evaluations of inherently interpretable models, especially for tabular data, remain relatively scarce and often focus primarily on aggregated performance outcomes. To address this gap, we present a large-scale comparative evaluation of 16 inherently interpretable methods, ranging from classical linear models and decision trees to more recent approaches such as Explainable Boosting Machines (EBMs), Symbolic Regression (SR), and Generalized Optimal Sparse Decision Trees (GOSDT). Our study spans 216 real-world tabular datasets and goes beyond aggregate rankings by stratifying performance according to structural dataset characteristics, including dimensionality, sample size, linearity, and class imbalance. In addition, we assess training time and robustness under controlled distributional shifts. Our results reveal clear performance hierarchies, especially for regression tasks, where EBMs consistently achieve strong predictive accuracy. At the same time, we show that performance is highly context-dependent: SR and Interpretable Generalized Additive Neural Networks (IGANNs) perform particularly well in non-linear regimes, while GOSDT models exhibit pronounced sensitivity to class imbalance. Overall, these findings provide practical guidance for practitioners seeking a balance between interpretability and predictive performance, and contribute to a deeper empirical understanding of interpretable modeling for tabular data.
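
To make the evaluation setup concrete, the sketch below shows, under assumptions of our own, how a small subset of inherently interpretable models (an Explainable Boosting Machine, a shallow decision tree, and a logistic regression baseline) could be compared on a single tabular dataset with cross-validation. The dataset, hyperparameters, and metric are illustrative placeholders and do not reproduce the paper's protocol over its 16 methods and 216 datasets.

```python
# Minimal sketch (not the paper's protocol): cross-validated comparison of a
# few inherently interpretable classifiers on one tabular dataset. The
# dataset choice and hyperparameters below are illustrative assumptions.
from interpret.glassbox import ExplainableBoostingClassifier  # pip install interpret
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

models = {
    "logistic_regression": LogisticRegression(max_iter=5000),
    "decision_tree": DecisionTreeClassifier(max_depth=4),
    "ebm": ExplainableBoostingClassifier(),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```

Repeating such a comparison across many datasets, rather than a single one, is what enables the stratified analysis the paper reports.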

Paper Summary

Problem
The main problem addressed in this research paper is the lack of transparency and understanding in complex machine learning models, particularly in high-stakes domains like healthcare, finance, and law. These "black box" models can make decisions without explaining how they arrived at those conclusions, which can lead to accountability issues and mistrust.
Key Innovation
This paper presents a large-scale comparative evaluation of 16 inherently interpretable machine learning models, which are transparent by design and provide exact explanations of their decision-making processes. This is a significant innovation because it offers a comprehensive assessment of model behavior across diverse problem types, moving beyond aggregate performance and enabling a more fine-grained and interpretable analysis.
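As an illustration of how such a fine-grained, dataset-aware analysis might be organized, the sketch below computes a few simple structural descriptors (sample size, dimensionality, class-imbalance ratio, and a linear-baseline score as a rough linearity proxy) that could be used to stratify results. These particular proxies are assumptions made for illustration; the paper's exact stratification criteria may differ.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact criteria):
# profile structural characteristics of a tabular dataset so that model
# performance can later be stratified by dimensionality, sample size,
# class imbalance, and an approximate notion of linearity.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

n_samples, n_features = X.shape
class_counts = np.bincount(y)
imbalance_ratio = class_counts.max() / class_counts.min()

# Rough linearity proxy: how well a plain linear classifier already does.
linear_baseline = cross_val_score(
    LogisticRegression(max_iter=5000), X, y, cv=5, scoring="accuracy"
).mean()

print(f"samples={n_samples}, features={n_features}, "
      f"imbalance_ratio={imbalance_ratio:.2f}, "
      f"linear_baseline_accuracy={linear_baseline:.3f}")
```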
Practical Impact
The practical impact of this research is significant, particularly in high-stakes domains where transparency and accountability are crucial. By evaluating and comparing different interpretable models, this study can help practitioners choose the best models for their specific needs, ensuring that decisions are made with a clear understanding of the underlying logic. This can lead to increased trust in AI systems, improved decision-making, and better outcomes in various fields.
Analogy / Intuitive Explanation
Think of machine learning models as recipe books. Traditional black-box models are like sprawling cookbooks with convoluted instructions and unlisted ingredients: the dish may come out right, but you cannot tell why. Interpretable models, on the other hand, are like simple recipe cards with clear, step-by-step instructions and every ingredient spelled out. This study helps us understand which types of recipe books (models) are best suited to different types of dishes (problems), so that we can make informed decisions and trust the outcome.
Paper Information
Categories: cs.LG
arXiv ID: 2601.00428v1