


Re: st: pweight question

From: Steven Archambault
Subject: Re: st: pweight question
Date: Thu, 29 Apr 2010 19:01:14 -0600

"The scale of the weights (what they sum to) doesn't tell you whether
or not they are pweights. Scaling the variables to sum to the size of
the sample is something you do when you expect to use a package (or
command) that, like SPSS until recently, only accepts fweights."

Okay, now that I have read this, scaling the weights down to the sample
size does make sense. The data were originally in SPSS format. So,
essentially, scaling the weights back up would express them in terms of
the population of the country. Correct?
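
A minimal sketch of the rescaling asked about here, written in Python rather than Stata and using made-up numbers. It assumes the population total is known from an outside source, which, as noted further down the thread, is not always the case. The names `rescale_weights`, `weighted_total`, `weighted_mean`, and `population_size` are hypothetical and exist only for illustration. It shows that multiplying every weight by a constant leaves a weighted mean unchanged but changes a weighted total.

```python
def rescale_weights(weights, target_total):
    """Rescale weights proportionally so that they sum to target_total."""
    factor = target_total / sum(weights)
    return [w * factor for w in weights]


def weighted_total(values, weights):
    """Weighted estimate of a total: sum of w_i * y_i."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Weighted estimate of a mean: weighted total divided by sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical data: 4 observations whose weights sum to the sample size.
y = [10.0, 20.0, 30.0, 40.0]
w_sample = [0.5, 0.5, 1.5, 1.5]

# Assumption: the population size is known independently of the weights.
population_size = 1000.0

# Weights rescaled so that they sum to the population size.
w_pop = rescale_weights(w_sample, population_size)

print(weighted_mean(y, w_sample), weighted_mean(y, w_pop))    # same mean
print(weighted_total(y, w_sample), weighted_total(y, w_pop))  # different totals
```

The constant factor cancels in the mean because it appears in both the numerator and the denominator; in the total it appears only once, so the total is meaningful only when the weights sum to the population size.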


On Thu, Apr 29, 2010 at 6:52 PM, Steven Archambault wrote:
> Thanks for the responses thus far. I cannot say it is all clear to me
> now, but I am getting there.
> As for the strata and clustering, this is data that was to be taken as
> a representation of the population in several different "economic
> zones". The observations are taken from different villages in each
> zone. Actually, observations from each village have the exact same
> weight. I also know the population and area of the individual
> villages. I am assuming the "probability that an observation is in the
> sample" is based on the population density of that village or economic
> region. But, that isn't clear. Perhaps I could come up with my own
> weights retrospectively?
> I am also analyzing this for multilevel effects, using gllamm. So, I
> do expect the weights to matter.
> Any further guidance would be very helpful!
> Thanks,
> Steve
> On Thu, Apr 29, 2010 at 6:37 PM, Steve Samuels wrote:
>> I have other problems with these scaled weights.
>> First, if they are all you have, it is difficult to identify weights
>> that are too small. (Ken Brewer, Combined Survey Sampling Inference,
>> Wiley, p. 133).
>> Second, with these scaled weights one cannot recover the original ones
>> without information on the total, and the information is not always
>> available. In fact, for some samples, the population total isn't known
>> and the only estimate is based on the original probability weights.
>> Third, I wonder about the accuracy of the scaled weights. If n is
>> moderate and the sampling fraction is small, most of the significant
>> figures could be far to the right of the decimal place.
>> Finally, these weights just lead to confusion on the part of people
>> who were not in on their construction. The original poster was
>> confused on this occasion, and I was confused on another last year.
>> Steve
>> On Thu, Apr 29, 2010 at 5:47 PM, Stas Kolenikov wrote:
>>> On Thu, Apr 29, 2010 at 3:03 PM, Michael I. Lichter wrote:
>>>> The scale of the weights (what they sum to) doesn't tell you whether or not
>>>> they are pweights.
>>> That's not quite right. Properly scaled probability weights should sum
>>> up to the population size. This however is only relevant when you
>>> estimate -total-s. If you run pretty much any other analysis (means,
>>> ratios, proportions, any sort of regressions), then the scale of the
>>> weights cancels out. I would grind my teeth at the pweights that are
>>> scaled to the sample size, and maybe make some mental comments about
>>> the data provider, but won't be bothered very much by this nuisance.
>>> The scaling of the weights begins to matter again with multilevel
>>> data, in which the scaling is known to affect the accuracy of the
>>> variance component estimates.
>> --
>> Steven Samuels
>> 18 Cantine's Island
>> Saugerties NY 12477
>> USA
>> Voice: 845-246-0774
>> Fax: 206-202-4783

