
From: Misha Spisok <misha.spisok@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Difference in Difference for Proportions

Date: Tue, 22 Sep 2009 14:49:40 -0700

Many thanks, Austin and Jeph!

The Norton, Wang, and Ai SJ article was very informative. Also, the code examples clarified some things and, of course, raised more questions. If it is not kosher to post follow-up questions on the same thread, please let me know and I will re-post them as new questions. Otherwise, my follow-up questions are below.

The short version is: what's the difference between -blogit- and -logit-? Or, more accurately, in the context of grouped data, which standard error estimate is correct?

If, after using Austin's example, I run the following:

    logit union t south txsouth [pw=f]

and

    blogit y pop t south txsouth

I get, as expected (or hoped, in my case), the same coefficients. The standard errors are smaller in -blogit- because, as I understand it, -blogit- treats pop as the number of observations per row, so the number of "effective" observations is the sum of pop. I think this explains the difference in the standard errors. Specifically, with some minor adjustment for the "robustified" -logit- standard errors, the relationship between the -logit- and -blogit- standard errors is something like the following:

    s_blogit = sqrt(s_logit^2*(n_logit - k)/(n_blogit - k))

where s_blogit is the SE from -blogit-, s_logit is the SE from -logit-, n_logit is the number of observations from -logit-, n_blogit is the number of observations from -blogit-, and k is the number of independent variables, including the constant.

It strikes me that the standard errors from -blogit- are more reasonable, given the actual number of observations that lie behind the summarized data. Thus, it seems that the standard errors from using -inteff- will be as incorrect as those from -logit- for summarized data. While I could use the formula from Ai and Norton (2003) to calculate the standard error for the interaction term using the variance-covariance matrix returned after -blogit-, would this be making a mistake?

My data are not survey data.
They are "actual" data, in the sense that f is the true number of people with the condition and pop is the true population.

Thanks again,

Misha
(Using Stata 10.1)

On Fri, Sep 18, 2009 at 7:00 AM, Jeph Herrin <junk@spandrel.net> wrote:
>
> Thanks Austin,
>
> Yes, I should have specified the -rd- option, I meant
> the linear link function. I've become a fan of using
> binary (and binomial) linear regression for testing
> hypotheses.
>
> cheers,
> Jeph
>
>
> Austin Nichols wrote:
>>
>> Jeph--
>> Doesn't the interaction problem discussed in
>> http://www.stata-journal.com/sjpdf.html?articlenum=st0063
>> also rear its ugly head here?
>>
>> Probably also have to be careful of SEs--if the total populations are
>> summed weights from a survey, significance will likely be overstated.
>>
>> I'd probably go to -svy:tab- first in that case...
>>
>> sysuse psidextract, clear
>> keep if t>5
>> set seed 1
>> g f=ceil(uniform()*1000)
>> egen pop=total(f), by(south t)
>> svyset [pw=f], strata(t)
>> egen gp=group(t south), lab
>> svy:tab gp union if t>5, row ci
>> lincom _b[p42]-_b[p22]-(_b[p32]-_b[p12])
>> g txsouth=t*south
>> egen y=total(union*f), by(gp)
>> bys gp: replace y=. if _n<_N
>> li y t south pop if y<.
>> binreg y t south txsouth, n(pop)
>> binreg union t south txsouth [pw=f]
>> logit union t south txsouth [pw=f]
>> findit inteff
>>
>> On Thu, Sep 17, 2009 at 4:53 PM, Jeph Herrin <junk@spandrel.net> wrote:
>>>
>>> Not sure whether this helps you, but I would normally test this
>>> with an interaction term in a model. For instance
>>>
>>> gen txsouth=t*south
>>> binreg f t south txsouth, n(pop)
>>>
>>> Then testing the coefficient on -txsouth- is the same as
>>> testing whether there is a significant difference in differences.
>>>
>>> hth,
>>> Jeph
>>>
>>> Misha Spisok wrote:
>>>>
>>>> Hello, Statalist,
>>>>
>>>> In brief, how does one test a difference in difference of proportions?
>>>> My question is re-stated briefly at the end with reference to the
>>>> variables I present. A formula and/or reference would be appreciated
>>>> if no command exists.
>>>>
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
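
For readability, the approximate relationship between the two sets of standard errors proposed in Misha's message can be typeset as below. This only restates the poster's own conjecture (offered as "something like" the true relationship, up to the adjustment for robust standard errors); it is not a derived result. The symbols are the ones defined in the message: s is a standard error, n is the number of observations each command reports, and k is the number of estimated coefficients including the constant.

```latex
% Restatement of the conjectured relationship from the message (approximate).
% n_blogit is the sum of pop; n_logit is the number of rows used by -logit-.
\[
  s_{\text{blogit}}
  \;\approx\;
  \sqrt{\, s_{\text{logit}}^{2}\,
        \frac{n_{\text{logit}} - k}{n_{\text{blogit}} - k} \,}
  \;=\;
  s_{\text{logit}}\,
  \sqrt{\frac{n_{\text{logit}} - k}{n_{\text{blogit}} - k}}
\]
```

Because n_blogit (the sum of pop) is larger than n_logit in this example, the ratio under the square root is below one, which is consistent with the smaller -blogit- standard errors reported in the message.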

**Follow-Ups**:

- **Re: st: Difference in Difference for Proportions** *From:* Austin Nichols <austinnichols@gmail.com>

**References**:

- **st: Difference in Difference for Proportions** *From:* Misha Spisok <misha.spisok@gmail.com>
- **Re: st: Difference in Difference for Proportions** *From:* Jeph Herrin <junk@spandrel.net>
- **Re: st: Difference in Difference for Proportions** *From:* Austin Nichols <austinnichols@gmail.com>
- **Re: st: Difference in Difference for Proportions** *From:* Jeph Herrin <junk@spandrel.net>

