PhD Candidate, Department of Economics

Contact Information

Department of Economics
Northwestern University
2211 Campus Drive
Evanston, IL 60208
Phone: 872-203-4496

Email: brunofava@u.northwestern.edu
Website: bfava.com

Education

Ph.D., Economics, Northwestern University, 2026 (expected)
M.A., Economics, Northwestern University, 2021
B.A., Economics, Insper (Sao Paulo, Brazil), 2019

Primary Field of Specialization

Econometrics

Secondary Field of Specialization

Development Economics

Curriculum Vitae

Download CV

Job Market Paper

Training and Testing with Multiple Splits: A Central Limit Theorem for Split-Sample Estimators [link]
Abstract: As predictive algorithms grow in popularity, using the same dataset to both train and test a new model has become routine across research, policy, and industry. Sample-splitting attains valid inference on model properties by using separate subsamples to estimate the model and to evaluate it. However, this approach has two drawbacks: each task uses only part of the data, and different splits can lead to widely different estimates. Averaging across multiple splits, I develop an inference approach that uses more data for training, uses the entire sample for testing, and improves reproducibility. I address the statistical dependence from reusing observations across splits by proving a new Central Limit Theorem for a large class of split-sample estimators under arguably mild and general conditions. Importantly, I make no restrictions on model complexity or convergence rates. I show that confidence intervals based on the normal approximation are valid for many applications, but may undercover in important cases of interest, such as comparing the performance of two models. I develop a new inference approach for such cases, explicitly accounting for the dependence across splits. Moreover, I provide a measure of reproducibility for p-values obtained from split-sample estimators. Finally, I apply my results to two important problems in development and public economics: predicting poverty and learning heterogeneous treatment effects in randomized experiments. I show that my inference approach with repeated cross-fitting achieves better power than previous alternatives, often enough to find statistical significance that would otherwise be missed.

Working Papers

Predicting the Distribution of Treatment Effects via Covariate-Adjustment, with an Application to Microcredit [arxiv] 

  • Best Student Paper Award at the 32nd Midwest Econometrics Group Annual Conference (2024)

Abstract: Important questions for impact evaluation require knowledge not only of average effects, but of the distribution of treatment effects. The inability to observe individual counterfactuals makes answering these empirical questions challenging. I propose an inference approach for points of the distribution of treatment effects by incorporating predicted counterfactuals through covariate adjustment. I provide finite-sample valid inference using sample-splitting, and asymptotically valid inference using cross-fitting, under arguably weak conditions. Revisiting five randomized controlled trials on microcredit that reported null average effects, I find important distributional impacts, with some individuals helped and others harmed by the increased credit access.

Algorithmic Targeting in Credit Markets: Consequences of Data-Driven Lending Practices [coming soon]
(with Susan Athey, Dean Karlan, Adam Osman, and Jonathan Zinman)

Abstract: New machine learning methods may allow lenders to increase profits by enhancing targeting decisions based on individual-specific information. But are those who are most profitable to the bank also those who benefit the most from receiving access to credit? Using data from three randomized controlled trials on microcredit and machine learning algorithms, we demonstrate that lenders can increase profits by up to 27% through algorithm-driven lending decisions. However, the most profitable clients are often wealthier and more educated, shifting lending away from traditionally disadvantaged groups. We find no evidence that prioritizing lender profits negatively impacts borrower outcomes. Implementing an inclusive lending strategy that maximizes profits while maintaining average borrower income reduces lender profitability, achieving only one-third of the gains compared to profit-maximizing targeting. These findings highlight the critical tensions arising from algorithmic credit allocation, emphasizing that as predictive technologies evolve, the trade-offs between profitability and social inclusion may intensify.

Publications

Probabilistic Nearest Neighbors Classification [download] [code]
Entropy, 2024, 26(1), 39. (with Paulo C. Marques F. and Hedibert F. Lopes)
Abstract: Analysis of the currently established Bayesian nearest neighbors classification model points to a connection between the computation of its normalizing constant and issues of NP-completeness. An alternative predictive model constructed by aggregating the predictive distributions of simpler nonlocal models is proposed, and analytic expressions for the normalizing constants of these nonlocal models are derived, ensuring polynomial time computation without approximations. Experiments with synthetic and real datasets showcase the predictive performance of the proposed predictive model.

The Illusion of the Illusion of Sparsity: An exercise in prior sensitivity [download] [code]
Brazilian Journal of Probability and Statistics, 2021, Vol. 35, No. 4, 699-720. (with Hedibert F. Lopes)
Abstract: The emergence of Big Data raises the question of how to model economic relations when there is a large number of possible explanatory variables. We revisit the issue by comparing the possibility of using dense or sparse models in a Bayesian approach, allowing for variable selection and shrinkage. More specifically, we discuss the results reached by Giannone, Lenza and Primiceri (2020) through a “Spike-and-Slab” prior, which suggest an “illusion of sparsity” in Economics datasets, as no clear patterns of sparsity could be detected. We make a further revision of the posterior distributions of the model, and propose three experiments to evaluate the robustness of the adopted prior distribution. We find that the pattern of sparsity is sensitive to the prior distribution of the regression coefficients, and present evidence that the model indirectly induces variable selection and shrinkage, which suggests that the “illusion of sparsity” could be, itself, an illusion. Code is available on GitHub.

Work in Progress

Is Participant Feedback Predictive of Impact?
(with Gharad Bryan, Dean Karlan, Isabel Oñate, and Christopher Udry)

What Can We Learn from Harmonizing and Analyzing RCTs of Grant and Training Programs to Promote Entrepreneurship?
(with Florian de Bundel, Dean Karlan, William Parienté, and Christopher Udry)

References

Prof. Federico Bugni (Committee Co-chair)
Prof. Dean Karlan (Committee Co-chair)
Prof. Ivan Canay
Prof. Joel Horowitz