The Stata listserver

Re: st: The use of pweights with regress

From   "Mark Schaffer" <[email protected]>
To   [email protected]
Subject   Re: st: The use of pweights with regress
Date   Mon, 08 Nov 2004 18:19:35 -0000


A quick follow-up question:

From:           	[email protected] (Roberto G. Gutierrez, StataCorp)
To:             	[email protected]
Subject:        	Re: st: The use of pweights with regress
Date sent:      	Mon, 08 Nov 2004 11:33:27 -0600
Send reply to:  	[email protected]

> Mark Schaffer <[email protected]> writes:
> > In addition to the other advice you've gotten, a simple way to keep aweights
> > and pweights straight in your head is that (usually?  always?) aweights +
> > robust = pweights.  In any dataset, try (1) aweights on its own, (2)
> > aweights with robust, and (3) pweights.  You will usually (always?) find
> > that (2) and (3) give you the same output.
>
> aweights + robust is _usually_ equal to pweights, but that is more a
> computational coincidence than anything substantive.  One exception to this
> equivalence is -intreg-, where some extra work has to be done to honor the
> definition of aweights.
>
> By definition, aweights are for cell means data, i.e. data which have been
> collapsed through averaging, and pweights are for sampling weights.

Is this true by definition, strictly speaking?  One reason for using 
aweights may be to do WLS.  We might have a view on the form 
heteroskedasticity takes and use WLS to eliminate it.  It might be 
caused by collapsing data through averaging, but there are other 
reasons it can arise.

This is another way to remember the difference between aweights and 
pweights.  aweights can be used to remove, so to speak, 
heteroskedasticity.  pweights are supposed to help you reduce 
sampling biases in the coefficient estimates, but their use may 
actually introduce heteroskedasticity, and hence automatic use of the 
robust covariance estimator is triggered.  (I hope I got this right!)
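
A minimal numerical sketch of why aweights + robust and pweights usually
coincide for linear regression. It assumes the two weight types differ, for
this purpose, only by a rescaling (Stata's documentation describes aweights
as normalized to sum to the number of observations). The data, the weights,
the helper name wls_robust, and the n/(n-k) small-sample factor are all
assumptions made for illustration; this is not Stata's implementation.

```python
import math

def wls_robust(x, y, w):
    """Weighted least squares of y on [1, x] with sandwich (robust) SEs.

    Hypothetical helper for illustration only.
    """
    n, k = len(x), 2
    # Weighted normal equations: A = X'WX (symmetric 2x2), c = X'Wy
    a11 = sum(w)
    a12 = sum(wi * xi for wi, xi in zip(w, x))
    a22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    c1 = sum(wi * yi for wi, yi in zip(w, y))
    c2 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = a11 * a22 - a12 * a12
    # Inverse of A
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det
    b0 = i11 * c1 + i12 * c2
    b1 = i12 * c1 + i22 * c2
    # Middle of the sandwich: M = sum of (w_i * e_i)^2 * x_i x_i'
    m11 = m12 = m22 = 0.0
    for xi, yi, wi in zip(x, y, w):
        e = yi - b0 - b1 * xi
        s = (wi * e) ** 2
        m11 += s
        m12 += s * xi
        m22 += s * xi * xi
    # V = A^-1 M A^-1, times an assumed n/(n-k) small-sample factor
    adj = n / (n - k)
    t11 = i11 * m11 + i12 * m12
    t12 = i11 * m12 + i12 * m22
    t21 = i12 * m11 + i22 * m12
    t22 = i12 * m12 + i22 * m22
    v11 = adj * (t11 * i11 + t12 * i12)
    v22 = adj * (t21 * i12 + t22 * i22)
    return (b0, b1), (math.sqrt(v11), math.sqrt(v22))

# Invented data and weights
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
w_raw = [5.0, 1.0, 3.0, 2.0, 4.0, 1.0]                  # weights as supplied
w_norm = [wi * len(x) / sum(w_raw) for wi in w_raw]     # rescaled to sum to n

coef_p, se_p = wls_robust(x, y, w_raw)    # pweight-style: raw weights
coef_a, se_a = wls_robust(x, y, w_norm)   # aweight-style: normalized weights

print("raw weights:       ", coef_p, se_p)
print("normalized weights:", coef_a, se_a)
```

Multiplying every weight by a constant c scales X'WX by c and the middle
term by c^2, so c cancels in both the coefficients and the sandwich
variance. That is the sense in which the match is computational: it says
nothing about whether the weights are cell sizes or sampling weights.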


> In most
> cases, because of the way observation-level likelihood contributions are
> weighted, you can treat a cell mean of 5 observations (aweight) equivalently
> to one observation which represents 5 population members (pweight) as long as
> you do a robust variance calculation inherent to the analysis of survey data.
> The apparent universal equivalence of the two, however, is part of the reason
> that mistakes can be made and articles written citing the misuse of weights.
> I would not recommend that someone use aweights + robust in lieu of pweights
> when the latter are unavailable for a particular command, or vice versa.  If a
> type of weighting is unavailable for a command, there is usually good reason,
> for example, the concept of an aweight for panel data is not well-defined.
> --Bobby
> [email protected]

Prof. Mark E. Schaffer
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University, Edinburgh EH14 4AS  UK
44-131-451-3494 direct
44-131-451-3008 fax
44-131-451-3485 CERT administrator
