Latent Variable BART
Bayesian Additive Regression Trees, or BART, is a Bayesian nonparametric regression tool that fits data via a forest of decision trees, each of which is constrained not to dominate the overall fit. In this sense, it is similar to boosting, though it has the added advantage of allowing for posterior inference on any quantities of interest. In the decade since its conception, BART has established itself as a cutting-edge regression tool capable of state of the art performance on a wide variety of complex regression problems. Nevertheless, there are many types of data and models to which BART cannot yet appropriately be applied — notably those involving random effects, or latent variables in general. While it is easy to use BART and a random effect term in modeling data, it is not so simple to include the random effect within BART, given the difficulty of having a decision tree properly split on an unknown, latent variable. I’ve developed a method of doing this in the context of latent-variable density regression allowing BART’s inherent flexibility to be applied to accurate estimation of complicated conditional densities. Next, I hope to extend this methodological advancement to handle random effects models, models in which BART simultaneously performs community detection and regression (with community-specific parameters), and others.