Highlights
- Three weighting estimators:
  - Normalized kappa weighted
  - Normalized covariate-balancing propensity scores
  - Inverse-probability-weighted regression adjustment
- Continuous, binary, count, and fractional outcomes
- Balancing diagnostics and overidentification test
- Overlap plots
- Normalized kappa weighted covariate statistics
Treatment effects are hard to identify when individuals do not comply with their assigned treatment, that is, when unobservable characteristics determine whether the treatment was actually taken. But sometimes we are fortunate: we know the assigned treatment, or we have information on what motivates individuals to take the treatment. Such a variable is called an instrument. lateffects leverages this instrument to estimate meaningful causal effects. This feature is a part of StataNow™.
lateffects estimates a local average treatment effect (LATE), a causal effect for a subpopulation. We would prefer to identify an effect for the entire population, but in many instances this is not feasible because of unobservable differences between treated and untreated units. Left unaccounted for, these differences confound any causal effect we would like to identify. However, in an experimental setting, we may know the treatment to which each individual was randomly assigned. Or in an observational setting, we may have a binary exogenous variable that encourages individuals to take a treatment. With such a variable, which splits the population into groups as if they were randomly assigned, we can identify a treatment effect for those who comply with the treatment assignment: a LATE. Because the effect is identified only for compliers, the estimand is sometimes referred to as a complier average treatment effect.
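In potential-outcomes notation, the estimand described above can be written compactly. The symbols here are notation introduced for illustration and are not taken from the lateffects documentation: Y is the outcome, D the treatment actually received, Z the binary instrument or assignment, and D(1) > D(0) marks compliers. The ratio form on the right applies only without covariates and under the usual instrumental-variables assumptions (as-if-random assignment, exclusion, relevance, and no defiers).

```latex
\mathrm{LATE}
  = E\{\, Y(1) - Y(0) \mid D(1) > D(0) \,\}
  = \frac{E(Y \mid Z = 1) - E(Y \mid Z = 0)}
         {E(D \mid Z = 1) - E(D \mid Z = 0)}
```

The numerator is the effect of assignment on the outcome, and the denominator is the share of compliers, so the ratio rescales the assignment effect to the units whose treatment status the instrument actually changes.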
Without covariates, two-stage least squares estimates a LATE. But when covariates are incorporated into the model, that equivalence generally breaks down. In other words, two-stage least squares may not give you the causal effect you want. lateffects gives you a LATE with and without covariates.
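The no-covariate equivalence can be checked by hand. The sketch below uses hypothetical variable names (y for the outcome, d for the binary treatment received, z for the binary instrument) and assumes such a dataset is already in memory. It computes the ratio of the two assignment effects from ordinary regressions and compares it with ivregress 2sls, whose point estimate should match when there is one binary instrument and no covariates.

```stata
* Sketch with hypothetical variables y (outcome), d (treatment), z (instrument)

* Effect of the instrument on the outcome
regress y i.z, vce(robust)
scalar itt_y = _b[1.z]

* Effect of the instrument on treatment uptake (first stage)
regress d i.z, vce(robust)
scalar itt_d = _b[1.z]

* Ratio of the two effects: the no-covariate LATE estimate
display "Ratio estimate of the LATE: " itt_y / itt_d

* Two-stage least squares with the same instrument and no covariates
ivregress 2sls y (d = z), vce(robust)
```

Once covariates enter the second-stage equation, the 2SLS coefficient is no longer this simple ratio, which is the situation the weighting estimators are designed for.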
We would like to study how participating in an elementary school program that targets reading ability, participate, affects scores on a high school state exam used in college admissions, score. Enrollment in the program is voluntary. Participation is therefore not randomized, and a direct comparison of participants with nonparticipants would be confounded.
Fortunately, there was a lottery that selected some individuals within schools to participate in the program and some to be controls. The lottery gave more weight to children in low-income households. Families of selected students received a small one-time cash transfer if they decided to participate. The binary variable selected is 1 for those selected to participate. Of course, not all who were selected by the lottery participated in the program, and some students participated regardless of the lottery outcome.
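Before fitting a model, it can help to see how much noncompliance there is. A minimal sketch, assuming the example's variables selected and participate are in memory (the dataset itself is not named here):

```stata
* Rows: lottery outcome; columns: actual participation.
* Row percentages show take-up among selected and nonselected students.
tabulate selected participate, row

* Difference in participation rates between selected and nonselected students
regress participate i.selected, vce(robust)
```

A coefficient on selected that is clearly above zero indicates that the lottery moves participation, which is what makes it usable as an instrument.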
If the lottery had selected students randomly, we could use selected as an instrument and estimate the LATE. However, because we know that the lottery was based on income, we need to specify a model so that after adjusting for covariates, the selection is as if it were random. We conjecture that the instrument is as good as randomly assigned once we control for whether an individual lives in a rural area, rural (rural areas are poorer than urban areas in our population), and for whether they live in a low-income zone, lowinc. Using lateffects, we can fit a model to obtain a LATE of participating in the program on the exam score for those who complied with the treatment assignment. We type
. lateffects kappa (score) (participate) (selected i.lowinc i.rural)
We chose the normalized kappa weighted estimator for illustration. The first set of parentheses corresponds to the outcome, the second to the treatment, and the last to the instrument propensity-score model.
We obtain the following result:
. lateffects kappa (score) (participate) (selected i.lowinc i.rural)

Iteration 0:  EE criterion = 2.683e-19
Iteration 1:  EE criterion = 5.025e-31

Local average treatment effect                 Number of obs = 2,134
Estimator:       Normalized kappa
Outcome model:   Weighted Mean
Treatment model: Weighted Mean
IV pscore model: Logit

------------------------------------------------------------------------------
             |               Robust
       score | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
LATE         |
 participate |
   (1 vs 0)  |   .8205241   .2101326     3.90   0.000     .4086717    1.232376
------------------------------------------------------------------------------
The LATE is 0.82. This means that for the subpopulation of students that comply with the lottery, scores on the state exam would be 0.82 higher on average if they all participate in the reading program than if none of them participates. Scores are measured on a continuous scale from 1 to 10.
Suppose that we believe it is important to consider a student's grade point average (GPA) in 2005, gpa2005, to model both the treatment and the outcome. To model treatment or outcome, we need the inverse-probability-weighted regression adjustment (IPWRA) estimator. We type
. lateffects ipwra (score gpa2005) (participate gpa2005) (selected i.lowinc i.rural)

Iteration 0:  EE criterion = 3.126e-19
Iteration 1:  EE criterion = 5.159e-31

Local average treatment effect                 Number of obs = 2,134
Estimator:       IPW regression adjustment
Outcome model:   Linear
Treatment model: Logit
IV pscore model: Logit

------------------------------------------------------------------------------
             |               Robust
       score | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
LATE         |
 participate |
   (1 vs 0)  |    1.07929   .1433535     7.53   0.000     .7983219    1.360257
------------------------------------------------------------------------------
The estimated LATE is larger when we account for GPA (1.08 versus 0.82). In both cases, however, it is around 1 point.
After lateffects, we can inspect whether the treatment assignment is as if it were random after controlling for covariates. We may type
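. latebalance summarize

Covariate balance summary

                                     Raw      Weighted
------------------------------------------------------
Number of observations
                    Total          2,134         2,134
    Assigned to treatment            612     1,066.896
      Assigned to control          1,522     1,067.104
------------------------------------------------------

-----------------------------------------------------------------
             | Standardized differences          Variance ratio
             |        Raw     Weighted          Raw     Weighted
-------------+---------------------------------------------------
      lowinc |
          1  |   .3099247     .0001318     1.710866     1.000252
       rural |
          1  |   .2250566     .0002798     1.346618     1.000404
-----------------------------------------------------------------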
In the first table of the output, we see that after weighting by the inverse of the estimated probabilities of being assigned to treatment or control, the two groups are nearly equal in size (about 1,067 weighted observations each). In the raw sample, by contrast, 612 observations are assigned to treatment and 1,522 to control. In the second table, the weighted standardized differences are close to 0 and the weighted variance ratios are close to 1. This suggests that after controlling for covariates, treatment assignment is as if it were random, which supports our modeling choices.
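To make the raw columns concrete, they can be approximated by hand for lowinc. This sketch assumes the common definitions: the standardized difference is the difference in group means divided by the square root of the average of the two group variances, and the variance ratio is the assigned-to-treatment variance over the assigned-to-control variance. The exact formulas used by latebalance are in the manual, so treat this as an illustration rather than a replication.

```stata
* Mean and variance of lowinc among those assigned to treatment
summarize lowinc if selected == 1
scalar m1 = r(mean)
scalar v1 = r(Var)

* Mean and variance of lowinc among those assigned to control
summarize lowinc if selected == 0
scalar m0 = r(mean)
scalar v0 = r(Var)

* Raw (unweighted) balance statistics under the assumed definitions
display "Raw standardized difference: " (m1 - m0) / sqrt((v1 + v0) / 2)
display "Raw variance ratio:          " v1 / v0
```

The weighted columns apply the same idea after reweighting observations by the estimated instrument propensity scores.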
There is much more we can do. Above, we fit a linear model. When we have a binary, count, or fractional outcome, we can fit probit, logit, Poisson, fractional probit, or fractional logit models. We can also further check assumptions. We could verify that the overlap assumption is satisfied using lateoverlap. Additionally, we could look at the mean of covariates for the compliers to characterize the complier subpopulation using estat compliers.
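The two follow-up checks named above could be run right after estimation. The sketch below refits the first model and then calls both commands with no options. Calling them with no options is an assumption made for brevity; the manual lists the options each command accepts.

```stata
* Refit the normalized kappa weighted model from the example
lateffects kappa (score) (participate) (selected i.lowinc i.rural)

* Overlap plot for the instrument propensity scores (default options assumed)
lateoverlap

* Covariate means for the complier subpopulation (default options assumed)
estat compliers
```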
Read more about LATE in [CAUSAL] lateffects in the Stata Causal Inference and Treatment-Effects Estimation Reference Manual.