Bobby,

A quick follow-up question:

> Mark Schaffer <[email protected]> writes:
>
> > In addition to the other advice you've gotten, a simple way to keep aweights
> > and pweights straight in your head is that (usually?  always?) aweights +
> > robust = pweights.  In any dataset, try (1) aweights on its own, (2)
> > aweights with robust, and (3) pweights.  You will usually (always?) find
> > that (2) and (3) give you the same output.
>
> aweights + robust is _usually_ equal to pweights, but that is more a
> computational coincidence than anything substantive.  One exception to this
> equivalence is -intreg-, where some extra work has to be done to honor the
> definition of aweights.
>
> By definition, aweights are for cell means data, i.e. data which have been
> collapsed through averaging, and pweights are for sampling weights.

Is this true by definition, strictly speaking?  One reason for using
aweights may be to do WLS.  We might have a view on the form
heteroskedasticity takes and use WLS to eliminate it.  It might be
caused by collapsing data through averaging, but there are other
reasons it can arise.

This is another way to remember the difference between aweights and
pweights.  pweights can be used to remove, so to speak,
sampling biases in the coefficient estimates, but their use may
actually introduce heteroskedasticity, and hence automatic use of the
robust covariance estimator is triggered.  (I hope I got this right!)
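
For anyone who wants to try the three-way comparison quoted at the top,
here is a minimal sketch.  It assumes the auto dataset that ships with
Stata, and it uses rep78 purely as a stand-in weight variable; rep78 is
not a meaningful weight of either kind, so only the pattern in the output
matters, not the estimates themselves.

```stata
* Illustration only: rep78 is a hypothetical stand-in for a weight variable.
* Observations with a missing weight are dropped from all three fits.
sysuse auto, clear

* (1) aweights alone: weighted point estimates, conventional standard errors
regress price weight [aweight = rep78]

* (2) aweights with the robust (sandwich) variance estimator
regress price weight [aweight = rep78], vce(robust)

* (3) pweights: the robust variance estimator is used automatically
regress price weight [pweight = rep78]
```

With -regress-, the coefficients should agree across all three fits, and
the standard errors in (2) and (3) should agree with each other but
differ from those in (1).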

--Mark

> In most
> cases, because of the way observation-level likelihood contributions are
> weighted, you can treat a cell mean of 5 observations (aweight) equivalently
> to one observation which represents 5 population members (pweight) as long as
> you do a robust variance calculation inherent to the analysis of survey data.
>
> The apparent universal equivalence of the two, however, is part of the reason
> that mistakes can be made and articles written citing the misuse of weights.
>
> I would not recommend that someone use aweights + robust in lieu of pweights
> when the latter are unavailable for a particular command, or vice versa.  If a
> type of weighting is unavailable for a command, there is usually good reason,
> for example, the concept of an aweight for panel data is not well-defined.
>
> --Bobby
> [email protected]
