Sparse Explanations for Binary Classification
In high-dimensional settings, a sparsity-inducing model such as the LASSO is useful for eliminating irrelevant predictors from the fit. However, this global approach to sparsity ignores the fact that some predictors may be irrelevant for explaining the outcomes of some units but not others. Consider applying for a loan. An individual might be rejected because she has defaulted on a loan within the past 5 years. Other features (credit history, race, income, etc.) may not be irrelevant, but they are insufficient to outweigh the effect of that default on the loan decision. Such sparse explanations, in which only a few features determine a unit’s outcome, are interpretable and therefore of great value in many applied contexts.

To this end, I’m developing a method for fitting linear binary classification models that makes each unit’s resulting explanation a function of only a few predictors, while still maintaining good classification accuracy. I’m studying both a mixed-integer programming (MIP) formulation of this problem and a gradient-based approximation as a faster alternative.
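As a rough illustration of the gradient-based direction, the sketch below fits a linear classifier whose objective penalizes the average per-unit number of non-negligible feature contributions w_j * x_ij. Everything here is an assumption made for illustration, not the actual method: the logistic loss, the exponential surrogate for the per-unit count, and all parameter values are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: labels in {-1, +1} driven mostly by the first two features.
rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.standard_normal(n))

def objective(w, lam=0.1, gamma=0.1):
    # Logistic loss of the linear classifier sign(X @ w).
    loss = np.mean(np.logaddexp(0.0, -y * (X @ w)))
    # Per-unit feature contributions w_j * x_ij; a unit's "explanation"
    # is taken to be the set of features with non-negligible contribution.
    contribs = X * w
    # Smooth surrogate for the per-unit count of active contributions
    # (an assumed stand-in for the exact counting a MIP would do).
    per_unit_size = np.sum(1.0 - np.exp(-np.abs(contribs) / gamma), axis=1)
    return loss + lam * np.mean(per_unit_size)

res = minimize(objective, x0=0.01 * rng.standard_normal(d), method="L-BFGS-B")
w_hat = res.x

# Explanation for unit 0: features whose contribution exceeds a small threshold.
explanation = np.flatnonzero(np.abs(X[0] * w_hat) > 1e-2)
print("active features for unit 0:", explanation)
```

A MIP formulation would presumably count active contributions exactly via binary indicator variables; the smooth surrogate above trades that exactness for a differentiable objective that standard gradient-based solvers can handle quickly.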