FLAME: Interpretable Matching for Causal Inference

Abstract

Matching methods are a class of techniques for estimating causal effects from observational data. Such methods match similar units together to emulate the randomization achieved by controlled experiments. Crucially, matching methods rely on a distance measure to determine similarity and thereby match units together. In this talk, we present an R package, FLAME, implementing the Fast, Large-scale Almost Matching Exactly (FLAME) and Dynamic Almost Matching Exactly (DAME) algorithms for performing matching on categorical datasets. These algorithms learn a weighted Hamming distance metric via machine learning on a held-out dataset and match units directly on covariate values, prioritizing matches on more important covariates. The R package features an efficient bit-vector implementation, allowing it to scale to datasets with hundreds of thousands of units and dozens of covariates; a database implementation, currently under development, will allow it to operate on datasets too large to fit in memory. FLAME provides easy summarization, analysis, and visualization of treatment effect estimates, and offers a wide variety of options for how matching is performed, allowing users to make analysis-specific decisions throughout the matching procedure. We present an overview of the main functionality of the package and then illustrate an application to the 2010 US NCHS Natality Dataset, in which we study the effect of smoking during pregnancy on NICU admissions.
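
To make the idea of matching under a weighted Hamming distance concrete, here is a minimal Python sketch. It is not the FLAME package's R API or its bit-vector implementation. The function names (`weighted_hamming`, `match_on`), the toy data, and the covariate weights are all hypothetical, and the weights are fixed by hand rather than learned on a held-out dataset.

```python
from collections import defaultdict


def weighted_hamming(a, b, weights):
    """Sum the weights of the covariates on which units a and b disagree."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def match_on(units, treated, covariates):
    """Group units that agree exactly on the given covariate indices.

    Only groups with at least one treated and one control unit are kept,
    since a treatment effect cannot be estimated from a one-sided group.
    """
    groups = defaultdict(list)
    for i, unit in enumerate(units):
        key = tuple(unit[c] for c in covariates)
        groups[key].append(i)
    return [
        g for g in groups.values()
        if any(treated[i] for i in g) and not all(treated[i] for i in g)
    ]


# Toy categorical data: four units, three covariates (hypothetical values).
units = [(1, 0, 2), (1, 0, 3), (0, 1, 2), (0, 1, 2)]
treated = [True, False, True, False]
weights = [0.6, 0.3, 0.1]  # assumed importances, not learned here

# Units 0 and 1 differ only on the least important covariate.
d = weighted_hamming(units[0], units[1], weights)

# Exact matching on all covariates, then after dropping the least
# important covariate (index 2) to allow more units to be matched.
exact = match_on(units, treated, [0, 1, 2])
relaxed = match_on(units, treated, [0, 1])
```

In this toy example, matching on all three covariates leaves units 0 and 1 unmatched, while dropping the lowest-weight covariate places them in the same group. This mirrors the idea of prioritizing matches on more important covariates.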

Publication
useR! 2021
