Master's Thesis

Feature power: a new variable importance measure for random forests

Variable importance and interaction measures are crucial to breaking open the "black box" of machine-learned classifiers. The existing metrics, however, are data-driven and lack a solid mathematical foundation, which can lead to misleading conclusions on certain types of data. We propose feature power, a new variable importance measure based on the Shapley value of cooperative game theory. We evaluate the validity of this new measure and compare its behavior with that of existing variable importance metrics. We also introduce coalition power, a methodology for quantifying the collective power of a group of features. We demonstrate that both methods produce consistent, correct results on toy data, and we gain insights by applying feature power to real data sets. Finally, we discuss the extensibility of both power measures to other tree-based ensembles and neural networks.
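
For readers unfamiliar with the Shapley value mentioned above, the following is a minimal sketch of the standard game-theoretic definition: each player's value is the weighted average of its marginal contribution over all coalitions of the other players. It is not the thesis's definition of feature power, which the abstract does not spell out. The function name `shapley_values`, the characteristic function `toy_value`, and the feature names `x1`, `x2`, `x3` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    `value` maps a frozenset of players to a number (the characteristic
    function). The cost is exponential in the number of players, so this
    is only practical for small toy games.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    # Hypothetical game: x1 and x2 pay off only together; x3 adds nothing.
    return 1.0 if {"x1", "x2"} <= coalition else 0.0


phi = shapley_values(["x1", "x2", "x3"], toy_value)
print(phi)
```

In this toy game the two interacting players split the payoff equally and the irrelevant player receives nothing, which is the kind of behavior one would hope for from an importance measure built on the Shapley value.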

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.