
st: Difference in Difference for Proportions

From   Misha Spisok <>
Subject   st: Difference in Difference for Proportions
Date   Thu, 17 Sep 2009 13:30:30 -0700

Hello, Statalist,

In brief, how does one test a difference in differences of proportions?
My question is restated at the end in terms of the variables I present
below.  If no command exists, a formula and/or reference would be
appreciated.

I would like to test a difference in differences of proportions.
-prtest- and -prtesti- do not work (easily) for my data, even for a
simple test of differences.  My data are grouped: for each of N
states, I have the number of persons in state i with a condition in
year y (f) and the population of state i in year y (pop).  A
"treatment" is applied; the pre-treatment period is t=0 and the
post-treatment period is t=1.  States with south=1 are the treated
group and states with south=0 are the non-treated group.

For example, some observations may look like this:
state  year  pop      f      t  south
1      1990  1200000  10000  0  0
50     1990  3000000  900    0  1
1      2000  1500000  21000  1  0
50     2000  3900000  2900   1  1
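
To make the example reproducible, the four rows above could be entered with -input-. This is only a sketch: the storage types (int, long, byte) are my assumption, chosen so that the counts are stored exactly.

```stata
* Enter the four example observations (this clears any data in memory)
clear
input int state int year long pop long f byte t byte south
1  1990 1200000 10000 0 0
50 1990 3000000   900 0 1
1  2000 1500000 21000 1 0
50 2000 3900000  2900 1 1
end
list, noobs
```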

For differences in proportions within, for example, the pre-treatment
period, for states in two regions (south==0 and south==1), I use,

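/* total() is the documented name for egen's sum() function; the result */
/* is filled in only for observations that satisfy the if condition     */
egen f_north_0 = total(f) if south==0 & t==0
egen pop_north_0 = total(pop) if south==0 & t==0

egen f_south_0 = total(f) if south==1 & t==0
egen pop_south_0 = total(pop) if south==1 & t==0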

gen phat_n_0 = f_north_0/pop_north_0 /* proportion in north pre-treatment */
gen phat_s_0 = f_south_0/pop_south_0 /* proportion in south pre-treatment */
gen sp_n_0 = sqrt(phat_n_0*(1 - phat_n_0)/pop_north_0) /* standard error for phat_n_0 */
gen sp_s_0 = sqrt(phat_s_0*(1 - phat_s_0)/pop_south_0) /* standard error for phat_s_0 */

egen fn_0 = mean(f_north_0)
egen fs_0 = mean(f_south_0)
egen pn_0 = mean(pop_north_0)
egen ps_0 = mean(pop_south_0)

gen phat_0 = (fn_0 + fs_0)/(pn_0 + ps_0) /* pooled proportion, pre-treatment */
gen qhat_0 = 1 - phat_0

gen sp_0 = sqrt(phat_0*qhat_0*(1/pn_0 + 1/ps_0)) /* standard error of difference of proportions */

gen z_0 = (fs_0/ps_0 - fn_0/pn_0)/sp_0

(At this point I suppose I could use -prtesti- by summarizing the
relevant variables and then typing the results into the -prtesti-
command.  In any case, I think that neither -prtest- nor -prtesti-
will help me with testing a difference in differences.)
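
As a sketch of that -prtesti- route without retyping anything, the cell totals can be taken from r(sum) after -summarize- and passed as local macros. With the count option, -prtesti- reads its second and fourth arguments as numbers of successes rather than proportions. The macro names (fs, ps, fn, pn) are hypothetical; f, pop, south, and t are the variables described above. The south is listed first so that the reported difference is south minus north, matching the sign of z_0.

```stata
* Pre-treatment totals for the south (treated) group
quietly summarize f if south==1 & t==0
local fs = r(sum)
quietly summarize pop if south==1 & t==0
local ps = r(sum)

* Pre-treatment totals for the north (non-treated) group
quietly summarize f if south==0 & t==0
local fn = r(sum)
quietly summarize pop if south==0 & t==0
local pn = r(sum)

* Two-sample test of proportions from counts: obs1 successes1 obs2 successes2
prtesti `ps' `fs' `pn' `fn', count
```

Replacing t==0 with t==1 throughout gives the corresponding post-treatment test.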

This, it would seem, allows me to test the difference in proportions
in the pre-treatment period.  Generating the corresponding values for
the post-treatment period likewise lets me test the difference in
proportions in the post-treatment period.

/* total() is the documented name for egen's sum() function */
egen f_north_1 = total(f) if south==0 & t==1
egen pop_north_1 = total(pop) if south==0 & t==1

egen f_south_1 = total(f) if south==1 & t==1
egen pop_south_1 = total(pop) if south==1 & t==1

gen phat_n_1 = f_north_1/pop_north_1
gen phat_s_1 = f_south_1/pop_south_1
gen sp_n_1 = sqrt(phat_n_1*(1 - phat_n_1)/pop_north_1)
gen sp_s_1 = sqrt(phat_s_1*(1 - phat_s_1)/pop_south_1)

egen fn_1 = mean(f_north_1)
egen fs_1 = mean(f_south_1)
egen pn_1 = mean(pop_north_1)
egen ps_1 = mean(pop_south_1)

gen phat_1 = (fn_1 + fs_1)/(pn_1 + ps_1)
gen qhat_1 = 1 - phat_1

gen sp_1 = sqrt(phat_1*qhat_1*(1/pn_1 + 1/ps_1))

gen z_1 = (fs_1/ps_1 - fn_1/pn_1)/sp_1

How can I test (phat_s_1 - phat_s_0) - (phat_n_1 - phat_n_0),
given that each phat_* is a proportion?

My uninformed guess is that the test statistic might be
((phat_s_1 - phat_s_0) - (phat_n_1 - phat_n_0)) / s,
where s is some weighted combination of sp_0 and sp_1.
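
One candidate for s, offered as a sketch rather than a settled answer: if the four group-by-period proportions are treated as independent binomial estimates, the variance of the difference in differences is the sum of the four cell variances, so s = sqrt(sp_s_1^2 + sp_s_0^2 + sp_n_1^2 + sp_n_0^2). This uses the unpooled standard errors rather than the pooled sp_0 and sp_1. The independence assumption is questionable here, since the same states are observed in both periods. The scalar names below (F, N, P, V followed by the south and t values, plus did, se_did, z_did) are hypothetical.

```stata
* Cell totals, proportions, and binomial variances for each south-by-t cell
foreach g in 0 1 {
    foreach p in 0 1 {
        quietly summarize f if south==`g' & t==`p'
        scalar F`g'`p' = r(sum)
        quietly summarize pop if south==`g' & t==`p'
        scalar N`g'`p' = r(sum)
        scalar P`g'`p' = F`g'`p'/N`g'`p'
        scalar V`g'`p' = P`g'`p'*(1 - P`g'`p')/N`g'`p'
    }
}

* First digit is south (1 = treated), second digit is t (1 = post-treatment)
scalar did    = (P11 - P10) - (P01 - P00)
scalar se_did = sqrt(V11 + V10 + V01 + V00)   // assumes independent cells
scalar z_did  = did/se_did

display "DiD = " did "  SE = " se_did "  z = " z_did
display "two-sided p = " 2*normal(-abs(z_did))
```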

Many thanks,

