Construction of Artificial Most Representative Trees by Minimizing Tree-Based Distance Measures

Björn Hergen Laabs*, Lea L. Kronziel, Inke R. König, Silke Szymczak

*Corresponding author for this work

Abstract

The random forest (RF) algorithm is known for its predictive performance but has been criticized for its lack of interpretability due to its complex ensemble nature. To address the issue of explainability, our study questions the traditional approach of using most representative trees (MRTs) to simplify RF interpretation, highlighting the potential for misinterpretation due to non-informative early splits. To overcome these limitations, we propose a new method that constructs artificial representative trees (ARTs) with a greedy algorithm that iteratively builds a tree to minimize the distance to the RF ensemble, thereby preserving the predictive performance of the RF. We give a detailed description of the methodological framework for ART construction, including strategies for reducing computational complexity through variable preselection and quantile-based splitting. Results from extensive simulations demonstrate that ARTs reflect the RF's predictive performance more accurately and substantially reduce the false discovery rate, thus offering a more reliable interpretative model. The findings suggest that ARTs represent an advance in addressing the interpretation of RF models.
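The greedy construction with quantile-based splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the RF ensemble is summarized by a vector of its predictions (`target`), and it swaps in a simple squared prediction distance as a stand-in for the tree-based distance measures named in the title. All function names and parameters (`quantile_candidates`, `best_split`, `build_tree`, `n_quantiles`) are hypothetical, and `features` stands for the output of some variable preselection step.

```python
import numpy as np

def quantile_candidates(x, n_quantiles=4):
    """Candidate split points: interior quantiles of one feature (assumed scheme)."""
    probs = np.linspace(0, 1, n_quantiles + 1)[1:-1]
    return np.unique(np.quantile(x, probs))

def best_split(X, target, features, n_quantiles=4):
    """Greedy step: return (distance, feature, threshold) of the split whose
    two-leaf prediction is closest to `target` in squared distance."""
    best = None
    for j in features:  # only preselected variables are searched
        for t in quantile_candidates(X[:, j], n_quantiles):
            left = X[:, j] <= t
            if left.all() or not left.any():
                continue  # skip splits that leave one child empty
            pred = np.where(left, target[left].mean(), target[~left].mean())
            dist = float(np.sum((pred - target) ** 2))
            if best is None or dist < best[0]:
                best = (dist, j, float(t))
    return best

def build_tree(X, target, features, depth=2, n_quantiles=4):
    """Recursively add the distance-minimizing split until `depth` is used up."""
    split = best_split(X, target, features, n_quantiles) if depth > 0 else None
    if split is None:
        return {"leaf": float(target.mean())}
    _, j, t = split
    left = X[:, j] <= t
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree(X[left], target[left], features, depth - 1, n_quantiles),
        "right": build_tree(X[~left], target[~left], features, depth - 1, n_quantiles),
    }
```

Restricting the search to a few quantiles per preselected variable keeps the number of candidate splits per node at roughly `len(features) * (n_quantiles - 1)` instead of one per distinct observed value, which is the kind of complexity reduction the abstract refers to.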

Original language: English
Title of host publication: Communications in Computer and Information Science
Publication date: 2024
Publication status: Published - 2024
