Master's Thesis

Building explainable random forest models with applications in protein functional analysis

Over the past few decades, Machine Learning (ML) researchers have made great progress in developing methods for building predictive models that perform well across a variety of environments and situations. Unfortunately, these models are often difficult to understand and interpret, creating a barrier to wider adoption by users outside the ML research community. Our work focuses on providing better explainability in ML using the Random Forest method. In this work, we formalize the problem of explainability and investigate existing and novel measures and methods for improving model explainability. We apply our approach to 7 protein functional models in the Stanford FEATURE dataset and develop a one-page “explainability summary” to support users interested in better understanding protein function.
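As a purely illustrative sketch, not code from the thesis, the snippet below shows one existing explainability measure of the kind the abstract alludes to: impurity-based feature importances of a trained random forest, computed here with scikit-learn on synthetic data standing in for the FEATURE protein descriptors.

```python
# Minimal sketch (assumed setup, not the thesis pipeline): train a random
# forest and rank features by mean decrease in impurity, a common baseline
# explainability measure for this model family.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for a FEATURE-style protein functional dataset.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Report the five most important features across all trees in the forest.
importances = forest.feature_importances_
for rank, idx in enumerate(importances.argsort()[::-1][:5], start=1):
    print(f"{rank}. feature {idx}: importance {importances[idx]:.3f}")
```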
