
From: "Clive Nicholas" <Clive.Nicholas@newcastle.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Interpretation of Dummy Variable Coefficients under Weighted Least-Squares
Date: Tue, 22 Nov 2005 02:31:40 -0000 (GMT)

Jonathan DePeri wrote:

> I am running a regression using sample weights on a model which
> involves binary independent variables. The question may be
> elementary, but how should I interpret the parameter estimate of
> such a variable? Since the application of sample weights transforms
> the 0s and 1s of binary variables into 0s and values between 0 and
> 1, it is not clear to me that the interpretation should remain the
> same.

It's not clear from your post whether or not you're using

    . reg y x1 x2 d1 [pw=weightvar]

(or [aw=weightvar]). If you are, then it's more accurate to say that you are using weighted _ordinary_ least squares (WOLS), rather than WLS, which is conceptually (and statistically) different (see Winship and Radbill [1994: 241] for more).

In any case, I think the interpretation _is_ the same. Others may want to put this in more formal statistical language, but fitting a regression model with sampling weights simply adjusts the parameter estimates (and their standard errors) upwards or downwards for _all_ the independent variables in that model, and not just the continuous ones, in an effort to reduce the bias.

Fitting two models using the 'Garrett and Mitchell' dataset (available on request) demonstrates this. I create -jobless- as a dummy variable from a continuous variable recording the unemployment rate. The variable -europe- is also a dummy:

    . g jobless=1 if unem<=5
    (293 missing values generated)

    . recode jobless .=0
    (jobless: 293 changes made)

    . g weight=invnorm(uniform())

    . reg spend trade jobless growthpc europe

       Source |       SS       df       MS              Number of obs =     571
    ----------+------------------------------           F(  4,   566) =  125.92
        Model |  32430.9725     4  8107.74312           Prob > F      =  0.0000
     Residual |  36444.4887   566  64.3895561           R-squared     =  0.4709
    ----------+------------------------------           Adj R-squared =  0.4671
        Total |  68875.4612   570  120.834142           Root MSE      =  8.0243

    ---------------------------------------------------------------------------
        spend |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
    ----------+----------------------------------------------------------------
        trade |   .1298481   .0147212     8.82   0.000     .1009332    .1587629
      jobless |  -5.381454   .7148419    -7.53   0.000    -6.785521   -3.977387
     growthpc |  -1.165842   .1431781    -8.14   0.000    -1.447067   -.8846165
       europe |   6.994002   .9683677     7.22   0.000     5.091969    8.896035
        _cons |   34.88938   .9811031    35.56   0.000     32.96233    36.81642
    ---------------------------------------------------------------------------

    . reg spend trade jobless growthpc europe [pw=weight]
    (sum of wgt is   2.2378e+02)

    Linear regression                                   Number of obs =     282
                                                        F(  4,   277) =   94.57
                                                        Prob > F      =  0.0000
                                                        R-squared     =  0.4746
                                                        Root MSE      =  7.3868

    ---------------------------------------------------------------------------
              |               Robust
        spend |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
    ----------+----------------------------------------------------------------
        trade |   .1277293   .0168331     7.59   0.000     .0945923    .1608663
      jobless |  -5.427786   1.377935    -3.94   0.000    -8.140341   -2.715231
     growthpc |   -1.13484   .2256215    -5.03   0.000    -1.578991   -.6906894
       europe |   6.744966    1.26138     5.35   0.000     4.261858    9.228074
        _cons |   35.44936   .9949323    35.63   0.000     33.49077    37.40795
    ---------------------------------------------------------------------------

The weight I generated is, of course, nonsensical, since this is a panel dataset of eighteen countries, but it illustrates my point. The standard errors change dramatically, but note how the parameter estimates don't change all that much. That includes the dummy variable -jobless-: it is still significant post-weighting, although its p-value 'purchase' is weaker (indeed, that's the story for all of the variables in this example). The estimates have all simply been adjusted slightly after weighting.

If your post was really asking about WLS, then fitting an OLS 'between-effects' model (-xtreg, be-) both with and without the -wls- option would show you exactly the same thing.

CLIVE NICHOLAS        |t: 0(044)7903 397793
Politics              |e: clive.nicholas@ncl.ac.uk
Newcastle University  |http://www.ncl.ac.uk/geps

Reference:

Winship C and Radbill L (1994) "Sampling Weights and Regression Analysis", SOCIOLOGICAL METHODS AND RESEARCH 23(2): 230-57.
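
To make the point about interpretation concrete, here is a minimal sketch (not part of the original analysis) of the simplest case: a regression on a single dummy with sampling weights. It assumes Stata's shipped -auto- dataset rather than the 'Garrett and Mitchell' data, and the weight variable -w- is a hypothetical, strictly positive weight invented purely for illustration. In this one-regressor case the weighted coefficient on the dummy should equal the difference between the weighted group means, so the dummy keeps its usual "difference between the two groups" reading.

```stata
* Sketch only: uses Stata's shipped auto data, not the Garrett and Mitchell data
sysuse auto, clear

* Hypothetical, strictly positive weights (values 1, 2, 3); not real sampling weights
generate double w = 1 + mod(_n, 3)

* Weighted mean of price within each group of the 0/1 dummy -foreign-
summarize price [aw=w] if foreign == 1
scalar m1 = r(mean)
summarize price [aw=w] if foreign == 0
scalar m0 = r(mean)

* Regression of price on the dummy alone, using the same weights as pweights
regress price foreign [pw=w]

* The two numbers displayed below should agree up to rounding
display "coefficient on foreign:       " _b[foreign]
display "difference in weighted means: " scalar(m1) - scalar(m0)
```

The weights enter the calculation of the group means and never rescale the 0/1 values of the dummy itself. With further covariates in the model the coefficient becomes a weighted, covariate-adjusted difference, but it is read the same way.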

References: st: Interpretation of Dummy Variable Coefficients under Weighted Least-Squares (From: Jonathan Michael DePeri <jmd2118@columbia.edu>)

