
From: "Clive Nicholas" <[email protected]>
To: [email protected]
Subject: Re: st: regression models with small number of clusters
Date: Sat, 8 Jan 2005 02:19:11 -0000 (GMT)

Peter Muhlberger replied to Krishna D Rao:

> You could also try bootstrapping your coefficients from a random effects
> model, which would eliminate the small sample bias in your variance
> estimates.

A nice answer to Krishna's original poser, which I've been able to simulate whilst incorporating Roger Newson's suggestion to fit a fixed effects model to his data. I've followed Wood's (2004) suggestion in running 2000 bootstrap replications. Since, as Roger points out, Krishna doesn't give us any detailed information on his variables, I've assumed that the response variable in the dataset simulated below is continuous and uniformly distributed over the range 0-100:

. clear
. set more off
. set seed `=date("2005-01-07", "ymd")'
. set obs 360
obs was 0, now 360
. g id=_n
. g group=ceil(uniform()*12)
. expand 2
(360 observations created)
. tab group

      group |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         50        6.94        6.94
          2 |         72       10.00       16.94
          3 |         82       11.39       28.33
          4 |         60        8.33       36.67
          5 |         54        7.50       44.17
          6 |         58        8.06       52.22
          7 |         60        8.33       60.56
          8 |         50        6.94       67.50
          9 |         62        8.61       76.11
         10 |         54        7.50       83.61
         11 |         54        7.50       91.11
         12 |         64        8.89      100.00
------------+-----------------------------------
      Total |        720      100.00

. g y=uniform()*100
. g x1=uniform()
. g x2=uniform()*5
. g x3=invnorm(uniform())
. sort id
. by id: gen time=_n
. l id group y x1 x2 x3 time in 1/10

     +----------------------------------------------------------------+
     | id   group          y         x1         x2          x3   time |
     |----------------------------------------------------------------|
  1. |  1       6   24.03016   .1445585   .6673999   -1.081646      1 |
  2. |  1       6   90.45777   .6376978   3.680331   -1.410077      2 |
  3. |  2       3   12.22887    .989752    1.70654   -1.028015      1 |
  4. |  2       3   95.63952   .6426014    4.66782    -.621906      2 |
  5. |  3      11   20.20495   .9287896   4.912792   -1.984773      1 |
     |----------------------------------------------------------------|
  6. |  3      11    57.8842   .6628636   3.113226   -.3708619      2 |
  7. |  4      10   50.49711   .3376878   2.095944    .2773025      1 |
  8. |  4      10   12.29717   .9934924   .8407423    .6124281      2 |
  9. |  5      12   88.63429   .2615153   .8085947    .1702638      1 |
 10. |  5      12    31.2398   .1373881    .556083    .4438728      2 |
     +----------------------------------------------------------------+

. tsset id time
. bs "areg y x1 x2 x3, absorb(id) cluster(group)" _b _se, size(360) reps(2000) saving(kris) dots

command:      areg y x1 x2 x3 , absorb(id) cluster(group)
statistics:   b_x1       = _b[x1]
              b_x2       = _b[x2]
              b_x3       = _b[x3]
              b_cons     = _b[_cons]
              se_x1      = _se[x1]
              se_x2      = _se[x2]
              se_x3      = _se[x3]
              se_cons    = _se[_cons]
[...]

Bootstrap statistics                              Number of obs    =       720
                                                  Replications     =      2000

----------------------------------------------------------------------------
Variable   |  Reps  Observed      Bias Std. Err.  [95% Conf. Interval]
-----------+----------------------------------------------------------------
      b_x1 |  2000 -4.629344 -.0557279  11.65842  -27.49327   18.23458  (N)
           |                                      -28.52434   17.79685  (P)
           |                                       -28.4749   17.84085  (BC)
      b_x2 |  2000  1.275952 -.0159533  2.563146  -3.750765   6.302669  (N)
           |                                      -3.645857   6.343257  (P)
           |                                      -3.524056   6.556383  (BC)
      b_x3 |  2000  .5331025  .1028936  3.621918  -6.570027   7.636232  (N)
           |                                      -6.388688   7.618769  (P)
           |                                      -6.421975   7.581013  (BC)
    b_cons |  2000  49.81735   .001473  8.255104   33.62784   66.00686  (N)
           |                                       33.13453    65.7711  (P)
           |                                       32.77082   65.22783  (BC)
     se_x1 |  2000  8.710757  11.37133   5.44168  -1.961202   19.38272  (N)
           |                                        11.2756    32.2606  (P)
           |                                       5.797402   5.797402  (BC)
     se_x2 |  2000  1.648332  2.705572  1.060601  -.4316664    3.72833  (N)
           |                                       2.447499   6.681056  (P)
           |                                       1.554579   1.554579  (BC)
     se_x3 |  2000  2.375892  3.526444   1.52015  -.6053518   5.357137  (N)
           |                                       3.307063   9.179192  (P)
           |                                       1.835775   1.835775  (BC)
   se_cons |  2000  5.117283  8.791256  3.831319  -2.396513   12.63108  (N)
           |                                       7.339995   22.68564  (P)
           |                                       4.573055   4.573055  (BC)
----------------------------------------------------------------------------
Note:  N   = normal
       P   = percentile
       BC  = bias-corrected

In order to fire up -areg-, I was forced to take the liberty of -expand-ing the dataset by at least 2 and creating a -time- variable (thus simulating repeated observations for each individual; I've also assumed that this panel dataset is balanced). Otherwise, -areg- returns an "insufficient observations r(2000)" error.

Note that -absorb()- and -cluster()- do different jobs at the two levels: -absorb(id)- takes out the _individual_ fixed effects, whereas -cluster(group)- does not fit _group_ effects at all but adjusts the standard errors for correlation within each group. Both should be switched on here.

Unfortunately, I cannot simulate a dependent variable which induces heteroscedasticity, but this example should now give Krishna enough ammunition to solve his dilemma.

CLIVE NICHOLAS        |t: 0(044)7903 397793
Politics              |e: [email protected]
Newcastle University  |http://www.ncl.ac.uk/geps

Reference:

Wood M (2004) "Statistical Inference Using Bootstrap Confidence Intervals"
SIGNIFICANCE 1(4): 180-2.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
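
A possible variation on the -bs- call in the message: as written, it resamples individual observations, whereas with only 12 groups one may prefer to resample whole groups so that each bootstrap sample keeps the within-group dependence intact. The lines below are a minimal sketch of that idea using the -bootstrap- prefix syntax of later Stata releases (version 9 onwards), not the command the poster ran. It assumes the simulated variables y, x1, x2, x3, id and group are in memory; the seed value is arbitrary.

```stata
* Sketch only: cluster (block) bootstrap of the fixed-effects coefficients.
* Assumes y, x1, x2, x3, id and group exist as in the simulated dataset.
* The seed 12345 is an arbitrary choice made for this illustration.

* cluster(group) makes -bootstrap- draw whole groups with replacement,
* so every id nested in a sampled group enters with both of its rows.
* (The /// line continuation works in a do-file, not at the command line.)
bootstrap _b, reps(2000) seed(12345) cluster(group): ///
    areg y x1 x2 x3, absorb(id)

* Report percentile and bias-corrected intervals from the same replications.
estat bootstrap, percentile bc
```

Because the bootstrap standard errors come from the spread of the replicated coefficients, the inner -areg- call here does not need its own -cluster()- option; whether 12 resampled groups give intervals that can be trusted is a separate question that this sketch does not settle.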

