
From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Sample selection and endogeneity (or, combining heckman and ivreg)
Date: Thu, 6 Aug 2009 12:21:23 -0400

Adrian <kokootchke@hotmail.com> :

For GLM and GMM, you can read the Stata 11 manual entries for -gmm- and -glm- and refs cited therein, or Cameron and Trivedi (two books available from Stata's bookstore).

You can run a simulation for some independent normally distributed X variables and get one result, then run for some data that looks like yours and get a totally different result, so it makes sense to use data that looks like yours (same covariance structure)--it's easiest to just start with your data, and modify it as needed. The modifications would be: you specify the errors and the coefficients, so you know the true relationship between X and y, then you try to estimate it. The simulation comes in because you specify distributions for error terms and then you draw all the error terms needed 100 times, or (better) 10000 times, to assess the distribution of estimated coefs around true coefs, and rejection rates.

For example (note I don't have your data, so I start by making data up with -drawnorm-):

    clear all
    prog pheck, rclass
     syntax [, Corr(real .1) ]
     * draw correlated errors u and v, 2400 obs in 60 clusters
     matrix C = (1, `corr' \ `corr' , 1)
     drawnorm u v, n(2400) corr(C) clear
     g long i=mod(_n-1,60)+1
     egen mv=mean(v), by(i)
     forv i=2/5 {
      g x`i'=rnormal()
     }
     * x1 is endogenous through the cluster mean of v
     g x1=mv+x2+rnormal()
     g y1=(-x1/5-x3/5+u>0)
     g y2star=(y1/5+x1/5+x4/5+x5/5+v)
     * y2 is observed only where s is 1
     g s=(v+x1/5+x2/5+x3/5>0)
     g y2=y2star if s
     * OLS on the selected sample
     reg y2 y1 x1 x4 x5, cluster(i)
     foreach v of varlist y1 x1 x4 x5 {
      return scalar rb_`v'=_b[`v']
      return scalar rs_`v'=_se[`v']
     }
     test x1=.2
     return scalar rrej_x1=(r(p)<.05)
     * ad hoc method: probit for y1, then -heckman- with the
     * resulting correction term im added to both equations
     probit y1 x1 x3
     predict double xbeta1, xb
     predict p
     * note: xb below resolves to xbeta1 through variable abbreviation
     gen double im=normalden(xb)/normprob(xb) if y1==1
     replace im=-normalden(xb)/(1-normprob(xb)) if y1==0
     heckman y2 y1 x1 x4 x5 im, sel(x1 x2 im) cluster(i) iterate(1000)
     * keep -heckman- results only if it converged
     if e(cmd) == "heckman" {
      if e(converged) == 1 {
       foreach v of varlist y1 x1 x4 x5 {
        return scalar hb_`v'=_b[`v']
        return scalar hs_`v'=_se[`v']
       }
       test x1=.2
       return scalar hrej_x1=(r(p)<.05)
      }
     }
     * IV (two-step GMM), ignoring selection
     ivreg2 y2 (y1 x1=p x2 x3) x4 x5, gmm2s cluster(i)
     foreach v of varlist y1 x1 x4 x5 {
      return scalar ib_`v'=_b[`v']
      return scalar is_`v'=_se[`v']
     }
     test x1=.2
     return scalar irej_x1=(r(p)<.05)
     eret clear
    end
    set seed 1
    pheck
    simul,rep(100):pheck
    tw kdensity ib_x1 || kdensity hb_x1 || kdensity rb_x1, xli(.2)
    su *b_x1 *rej* *b_y1, sep(3)

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
           ib_x1 |       100    .1538109    .0398201     .08462   .2944242
           hb_x1 |        58    .4007295    .0309691   .3406418    .499454
           rb_x1 |       100    .0452054    .0146647   .0097073   .1013072
    -------------+--------------------------------------------------------
         irej_x1 |       100         .43    .4975699          0          1
         hrej_x1 |        58           1           0          1          1
         rrej_x1 |       100           1           0          1          1
    -------------+--------------------------------------------------------
           ib_y1 |       100    1.686989    .4401385   .9865659   3.472792
           hb_y1 |        58    2.718954    .3095008   2.231136   3.557839
           rb_y1 |       100    .2975349    .0352731   .2353329   .3744754

In this example, IV gets close to the true coef on x1 of 0.2 but overrejects by a huge margin (IV typically has a fraction of the OLS bias in finite samples), while both OLS and the ad hoc method using -heckman- do a terrible job (and -heckman- doesn't converge inside 1000 iterations in many cases, so the code takes forever to run). OLS looks better than IV and the ad hoc method for the coef on y1, but none of the methods performs adequately.

For your case, I would forget about the selection problem, and run some panel data model with instruments. If you want to take a "more correct" GMM approach and stack equations for the count of number of bonds issued in a period and equations for spreads (or log-spreads) on those bonds, you will need to find a coauthor, I suspect. But the -gmm- command in Stata 11 will help, probably.

On Thu, Aug 6, 2009 at 2:30 AM, kokootchke<kokootchke@hotmail.com> wrote:
> Austin, thank you very much for your response. I agree that not having
> a reference would weaken my results and this is why I'm trying to see
> if someone in this Stata group can point in the right direction. I have
> thought about the simulations as well and I'm contemplating doing that,
> but I've never done this before and would like some pointers as to
> where I should start. Would you have any suggestions or do you have a
> reference that could help in that regard? Also, what do you mean by
> "for samples that look like yours"?
>
> This is a very good point. I have also never used GLM/GMM in this
> context before, so could you please be more specific regarding what I
> need to know or where I should look in order to consider this option
> and try to implement it?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
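
For anyone following the -pheck- program in the message above, its data-generating process can be written out as equations. This is only a transcription of the -generate- statements, not an addition to the model. The subscripts (i for cluster, j for observation within cluster) and the symbols e, \rho, z, \hat\gamma and \hat\lambda are notation introduced here and do not appear in the original post.

```latex
\begin{align*}
(u_{ij}, v_{ij}) &\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right),
  \qquad \rho = 0.1 \text{ by default}, \\
x_{k,ij} &\sim N(0,1), \qquad k = 2,\dots,5, \\
x_{1,ij} &= \bar v_i + x_{2,ij} + e_{ij}, \qquad e_{ij} \sim N(0,1), \\
y_{1,ij} &= 1\{-x_{1,ij}/5 - x_{3,ij}/5 + u_{ij} > 0\}, \\
y^{*}_{2,ij} &= y_{1,ij}/5 + x_{1,ij}/5 + x_{4,ij}/5 + x_{5,ij}/5 + v_{ij}, \\
s_{ij} &= 1\{v_{ij} + x_{1,ij}/5 + x_{2,ij}/5 + x_{3,ij}/5 > 0\}, \\
y_{2,ij} &= y^{*}_{2,ij} \text{ if } s_{ij} = 1, \text{ missing otherwise},
\end{align*}
% \bar v_i is the mean of v over the 40 observations in cluster i (60 clusters, 2400 obs).
% Correction term "im" built from the probit of y1 on z = (1, x1, x3):
\[
\hat\lambda_{ij} =
\begin{cases}
\dfrac{\phi(z_{ij}\hat\gamma)}{\Phi(z_{ij}\hat\gamma)} & \text{if } y_{1,ij} = 1, \\[2ex]
-\dfrac{\phi(z_{ij}\hat\gamma)}{1 - \Phi(z_{ij}\hat\gamma)} & \text{if } y_{1,ij} = 0.
\end{cases}
\]
```

Read this way, the true coefficients on y1 and x1 in the y2 equation are both 0.2, which is the value the -test x1=.2- lines and the rejection rates are checked against. x1 is correlated with the outcome error through \bar v_i, y1 through the correlation between u and v, and selection depends on v directly.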

