Re: st: question regarding the past practice of survey data analysis

From	Richard Williams <[email protected]>
To	[email protected]
Subject	Re: st: question regarding the past practice of survey data analysis
Date	Sun, 23 Nov 2003 17:14:17 -0500

At 11:01 PM 11/23/2003 +0900, Yukio MAEDA wrote:

Then, my question is whether ignoring weight (for regression) is a
common practice in some part of the discipline in the past or not.  And
if so, I would like to know if there is any paper or reference that
explains why one can ignore weight for regression but cannot for
descriptive inference.

We had a discussion about this in my department a while back. Some of the things we came up with (and I won't swear to their correctness) are that, IF the model is correctly specified, failure to weight will

* Produce incorrect descriptive statistics (e.g. suppose you oversampled low-income minorities; estimates of mean income for the whole population would be too low if you did not adjust for that)

* Leave estimates of unstandardized (aka metric) coefficients unbiased. Further, people who had experimented with weighting and not weighting found that, with their data sets at least, the estimates differed little either way.

* Likely give wrong estimates of standardized coefficients and R-square if the weighting was wrong, since both depend on the sample variances of the variables, which oversampling distorts. So, if you like to emphasize such things in your writing, be very careful about this. I illustrate this in a handout currently located at http://www.nd.edu/~rwilliam/xsoc593/homework/hw08ak.pdf. The location will change in a few weeks when I update my course notes.

* Affect standard errors, confidence intervals and significance tests. However, there was some dispute over whether or not this was a bad effect. If, say, you've oversampled certain racial groups, then with weights the estimated standard errors for the race dummies may be too high, because the weighted Ns for the minorities will be smaller than the true Ns. Conversely, I would think that if you're not sure how accurate your significance tests are, how confident can you be that the model is correctly specified? And even if it is correctly specified, isn't it still important to know how big the confidence intervals are? I'll add that I am also unsure whether the weighting and survey schemes available in Stata can adjust for these problems. I'm just not familiar enough with them yet.

One strategy might be to run the analysis both with and without weights, and see if it makes much difference. If the results are very different, it might cast doubt on your belief that the model is correctly specified.
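
The first two points above can be checked with a small simulation. The following is a minimal sketch in Python with NumPy, not anything from the original discussion: the population, the selection probabilities (0.04 and 0.01), the coefficients, and the helper `wls` are all made up for illustration. It oversamples a low-income group, then compares unweighted and inverse-probability-weighted estimates of mean income and of the regression coefficients under a correctly specified linear model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: about 20% belong to a low-income group (g = 1).
N = 200_000
g = (rng.random(N) < 0.2).astype(float)
x = rng.normal(12.0, 2.0, N)  # e.g. years of schooling (assumed)
# Correctly specified linear model for income; all coefficients are assumed.
y = 10.0 + 3.0 * x - 15.0 * g + rng.normal(0.0, 5.0, N)

# Sample design: group g = 1 is four times as likely to be selected.
p_select = np.where(g == 1, 0.04, 0.01)
sampled = rng.random(N) < p_select
ys, xs, gs = y[sampled], x[sampled], g[sampled]
w = 1.0 / p_select[sampled]  # inverse-probability sampling weights


def wls(X, y, w):
    """Weighted least squares point estimates: solve (X'WX) b = X'Wy."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)


# Descriptive statistic: mean income, without and with weights.
unweighted_mean = ys.mean()
weighted_mean = np.average(ys, weights=w)

# Regression coefficients (intercept, x, g), without and with weights.
X = np.column_stack([np.ones(xs.size), xs, gs])
b_unweighted = wls(X, ys, np.ones(xs.size))
b_weighted = wls(X, ys, w)

print("population mean :", y.mean())
print("unweighted mean :", unweighted_mean)
print("weighted mean   :", weighted_mean)
print("unweighted coefs:", b_unweighted)
print("weighted coefs  :", b_weighted)
```

Under these assumptions the unweighted mean should come out well below the population mean, while the weighted and unweighted coefficients should both sit close to the true values. If the slope differed by group and the model ignored that, the two sets of coefficients could diverge, which is the logic of comparing them.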

I think the point about standard errors and significance tests is the most critical one, and it is also the point I myself am most fuzzy about, so I too would be interested in what others have to say.
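
On the standard-error point, one piece that can be checked by simulation is how much extra sampling variability the weights add when the model really is correctly specified and the error variance is constant. The sketch below is an illustration in Python with NumPy under those assumptions only; the group sizes, the weights (25 and 100), the coefficients, and the helper `fit` are hypothetical. It repeatedly draws samples from the same model and compares the spread of the weighted and unweighted slope estimates.

```python
import numpy as np

rng = np.random.default_rng(1)


def fit(X, y, w):
    """Weighted least squares point estimates: solve (X'WX) b = X'Wy."""
    Xw = X * w[:, None]
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)


# Hypothetical sample: 200 oversampled minority cases, 200 majority cases.
g = np.r_[np.ones(200), np.zeros(200)]
# Assumed sampling weights: each minority case stands for 25 people,
# each majority case for 100.
w = np.where(g == 1, 25.0, 100.0)
ones = np.ones_like(w)

reps = 2000
slopes_unw = np.empty(reps)
slopes_w = np.empty(reps)
for r in range(reps):
    x = rng.normal(0.0, 1.0, g.size)
    # Correctly specified model with constant error variance (assumed).
    y = 1.0 + 2.0 * x - 1.5 * g + rng.normal(0.0, 1.0, g.size)
    X = np.column_stack([np.ones(g.size), x, g])
    slopes_unw[r] = fit(X, y, ones)[1]
    slopes_w[r] = fit(X, y, w)[1]

print("unweighted slope: mean, sd =", slopes_unw.mean(), slopes_unw.std())
print("weighted slope  : mean, sd =", slopes_w.mean(), slopes_w.std())
```

Under these assumptions both estimators center on the true slope, and the weighted one is somewhat noisier, as the Gauss-Markov theorem would suggest. This says nothing about how well any particular package's reported standard errors track that spread, which is a separate question.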

-------------------------------------------
Richard Williams, Associate Professor
OFFICE: (574)631-6668, (574)631-6463
FAX: (574)288-4373
HOME: (574)289-5227
EMAIL: [email protected]
WWW (personal): http://www.nd.edu/~rwilliam
WWW (department): http://www.nd.edu/~soc
