A Guide to ARTs

In Canay, Romano, and Shaikh (2017) we extended the scope of applicability of randomization tests to cases where such tests could not be justified in finite samples but where it was possible to argue their asymptotic validity under fairly mild conditions. This led to what we call Approximate Randomization Tests (ARTs). An important setting where such tests proved to be particularly useful is the one where the data can be grouped into a small number of clusters.

While ARTs are not particularly difficult to implement from a computational standpoint, the principal goal of Canay, Romano, and Shaikh (2017) was to develop the general theory of ARTs, and that paper did not focus on the details of their implementation. In Cai, Canay, Kim, and Shaikh (2021) we now provide a user’s guide to the general theory of ARTs when specialized to linear regressions with clustered data. Such regressions include settings in which the data are naturally grouped into clusters, such as villages or repeated observations over time on individual units, as well as settings with weak temporal dependence, in which pseudo-clusters may be formed using blocks of consecutive observations. An important feature of the methodology is that it applies to settings in which the number of clusters is small, even as small as five.

Cai, Canay, Kim, and Shaikh (2021) provides a step-by-step algorithmic description of how to implement ARTs and construct confidence intervals for the parameters of interest. We additionally articulate the main requirements underlying the test, emphasizing in particular common pitfalls that researchers may encounter. Finally, we illustrate the use of the methodology with two applications that further elucidate these points: one to a linear regression with clustered data based on Meng et al. (2015) and a second to a linear regression with temporally dependent data based on Munyo and Rossi (2015).
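To give a flavor of the procedure, the following is a minimal Python sketch of a sign-change test for a single regression coefficient. It assumes that the coefficient has already been estimated separately within each cluster and that the test statistic is the absolute t-statistic of those cluster-level estimates. The function names (`t_stat`, `art_pvalue`) and the numbers in the usage example are hypothetical and are not part of the companion Stata package; the step-by-step description in the paper remains the reference.

```python
from itertools import product
from math import inf, sqrt
from statistics import mean, stdev


def t_stat(values):
    """Absolute value of the one-sample t-statistic of `values`."""
    center, spread = mean(values), stdev(values)
    if spread == 0:
        # Degenerate case: all values identical.
        return inf if center != 0 else 0.0
    return abs(sqrt(len(values)) * center / spread)


def art_pvalue(cluster_estimates, null_value=0.0):
    """Sign-change p-value for H0: coefficient == null_value.

    `cluster_estimates` holds one estimate of the coefficient per cluster
    (assumed to be computed beforehand, cluster by cluster).
    """
    centered = [b - null_value for b in cluster_estimates]
    observed = t_stat(centered)
    hits = 0
    total = 0
    # Enumerate all 2^q vectors of sign changes across the q clusters.
    for signs in product((-1.0, 1.0), repeat=len(centered)):
        flipped = [s * x for s, x in zip(signs, centered)]
        # Small tolerance so exact ties survive floating-point rounding.
        if t_stat(flipped) >= observed * (1 - 1e-12):
            hits += 1
        total += 1
    return hits / total


# Hypothetical estimates from six clusters.
estimates = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3]
p = art_pvalue(estimates)
print(p)
```

In this sketch the observed sign pattern and its mirror image always yield the same statistic, so with q clusters the p-value cannot fall below 2/2^q. A confidence interval could then be obtained by collecting the values of `null_value` that are not rejected.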

To facilitate adoption of ARTs, we have developed a companion Stata package (see the ARTs Bitbucket Repository or visit the software page) and also provided R and Stata files to replicate the two empirical exercises in the paper (replication files).

 

Stata: Inference under CAR

We have finished the first version of a Stata package that computes point estimates and standard errors for average treatment effects in randomized controlled experiments with covariate-adaptive randomization. Below you can download the package (which includes two ado files and an example of how to use them) and the paper introducing the adjustments to the standard errors in each type of regression (SFE and SAT). Visit the software page here for additional Stata and R packages.

New paper and Stata package for continuity in RDD

In the regression discontinuity design (RDD), it is common practice to assess the credibility of the design by testing the continuity of the density of the running variable at the cut-off, e.g., McCrary (2008). In joint work with Federico Bugni, we propose a new test for continuity of a density at a point based on the so-called g-order statistics, and study its properties under a novel asymptotic framework. The asymptotic framework is intended to approximate a small sample phenomenon: even though the total number n of observations may be large, the number of effective observations local to the cut-off is often small. Thus, while traditional asymptotics in RDD require a growing number of observations local to the cut-off as n grows, our framework allows for the number q of observations local to the cut-off to be fixed as n grows. The new test is easy to implement, asymptotically valid under weaker conditions than those used by competing methods, exhibits finite sample validity under stronger conditions than those needed for its asymptotic validity, and has favorable power properties against certain alternatives. You can find a copy of the paper here.
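The idea can be illustrated with a short Python sketch. It assumes the test reduces to a two-sided binomial test: take the q observations closest to the cut-off and compare the number falling to its right with a Binomial(q, 1/2) distribution. The function name `continuity_pvalue` is hypothetical, and q is taken here as a user-supplied input rather than chosen by any data-dependent rule; the paper and the Stata package are the authoritative description.

```python
from math import comb


def continuity_pvalue(running, cutoff=0.0, q=20):
    """Two-sided binomial p-value based on the q observations nearest the cut-off."""
    if q > len(running):
        raise ValueError("q cannot exceed the number of observations")
    # Order observations by distance to the cut-off and keep the q closest.
    nearest = sorted(running, key=lambda z: abs(z - cutoff))[:q]
    # Count how many of them lie at or to the right of the cut-off.
    right = sum(1 for z in nearest if z >= cutoff)
    # Under the null the count is approximately Binomial(q, 1/2);
    # sum the probabilities of outcomes at least as far from q/2.
    deviation = abs(right - q / 2)
    tail = sum(comb(q, k) for k in range(q + 1) if abs(k - q / 2) >= deviation)
    return tail / 2 ** q


# Hypothetical running variable: the four nearest points are all to the right.
print(continuity_pvalue([0.1, 0.2, 0.3, 0.4, -5.0, -6.0], cutoff=0.0, q=4))
```

Because only the q observations nearest the cut-off enter the calculation, the total sample size n plays no role once those observations are selected, which mirrors the fixed-q framework described above.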

We have also finished the first version of a Stata package that implements the new test we propose. You can download the package from the Bitbucket repository (Rdcont), which includes the ado file with an example of how to use it. Visit the software page here for additional Stata and R packages.

Stata: inference with few clusters

We have finished the first version of a Stata package that computes the approximate randomization test for inference in models with a small number of clusters. Below you can download the package (which includes the ado file with an example of how to use it), a brief tutorial, and the paper introducing the new test. Visit the software page here for additional Stata and R packages.

New software page

A new software page is now available here that contains all current (and will contain all future) Stata and R packages. These packages are in a Bitbucket repository, where you can download each package and also leave comments for issues/problems or suggestions for new features. All the Stata and R modules are distributed under the terms of the license files in the repositories. In particular, the software is provided “as is”, without warranty of any kind. If you have suggestions or requests about any of these software packages, please use the corresponding Bitbucket issues feature so that we can keep track of them.

Stata: permutation tests for RDD

We have finished the first version of a Stata package that computes the approximate permutation test developed by Canay and Kamat (2016) for the regression discontinuity design. Below you can download the package (which includes the ado file with an example of how to use it) and the paper introducing the new test. For additional Stata and R packages, visit the software page here.
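As a rough illustration of the kind of calculation involved, the Python sketch below compares the distribution of a baseline covariate among the q observations closest to the cut-off on each side, using a Cramér-von Mises-type statistic and random permutations of the pooled sample. The function names (`cvm_stat`, `rdd_permutation_pvalue`), the default values of `q` and `n_perm`, and the use of randomly drawn permutations are assumptions made for this example and do not describe the interface of the Stata package.

```python
import random


def cvm_stat(left, right):
    """Cramér-von Mises-type distance between two empirical CDFs."""
    pooled = left + right

    def ecdf(sample, t):
        return sum(1 for v in sample if v <= t) / len(sample)

    return sum((ecdf(left, t) - ecdf(right, t)) ** 2 for t in pooled) / len(pooled)


def rdd_permutation_pvalue(running, covariate, cutoff=0.0, q=10, n_perm=999, seed=0):
    """Permutation p-value for equal covariate distributions near the cut-off."""
    pairs = list(zip(running, covariate))
    # Keep the q observations closest to the cut-off on each side.
    left = sorted((p for p in pairs if p[0] < cutoff), key=lambda p: cutoff - p[0])[:q]
    right = sorted((p for p in pairs if p[0] >= cutoff), key=lambda p: p[0] - cutoff)[:q]
    w_left = [w for _, w in left]
    w_right = [w for _, w in right]
    observed = cvm_stat(w_left, w_right)

    rng = random.Random(seed)
    pooled = w_left + w_right
    hits = 0
    for _ in range(n_perm):
        # Reassign the pooled covariate values to the two sides at random.
        rng.shuffle(pooled)
        stat = cvm_stat(pooled[:len(w_left)], pooled[len(w_left):])
        if stat >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return (1 + hits) / (1 + n_perm)


# Hypothetical data: the covariate jumps from 0 to 1 at the cut-off.
running = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
covariate = [0] * 5 + [1] * 5
print(rdd_permutation_pvalue(running, covariate, q=5))
```

A small p-value in this sketch indicates that the covariate is distributed differently just to the left and just to the right of the cut-off.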