Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.

# Re: st: sampling weight

| Field | Value |
| --- | --- |
| From | Stas Kolenikov |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: sampling weight |
| Date | Wed, 26 Sep 2012 22:41:16 -0500 |

```
These are steps in the right direction. Please describe your sampling
design in full detail, so that we can brainstorm and see what the
right specifications should be.

--
-- Stas Kolenikov, PhD, PStat (SSC)  ::  http://stas.kolenikov.name
-- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at
srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the
position of my employer

On Wed, Sep 26, 2012 at 9:58 PM, Lynn Lee <lynn09v@gmail.com> wrote:
> Dear Stas,
>
> I just want to do simple sampling.
>
> Take "webuse total" for example. I am wondering how "swgt" was generated. My
> guess: obs 1 has her corresponding sampling weight, swgt=25964, which is the
> total population in her group; obs 4 has his corresponding sampling weight,
> swgt=4312, which is the total population in his group; etc. Is that right?
>
> So, if I apply this logic to my downloaded survey data sets, I can assign
> sampling weights to the observations by place of residence and gender.
> For example: I calculate the total number of individuals in the data set
> by their city of residence, say, 1000 individuals in city 1 and 400
> individuals in city n, and then I generate this city total as a new
> variable (weight). (Or I can be even more detailed and count the people in
> the data set by city, gender, and age.) In the regression, I simply use the
> command "reg y x1 x2 x3 [pweight=total]". Can this partly correct for the
> data set being unweighted?
>
> Suppose the mean of total (the weights) is 500, the min is 100, and the max
> is 800. Then the weighted analysis will give at most 800/100 times the weight
> to potentially under-sampled observations. Do I understand correctly?
>
>
> Best Regards,
> Lynn Lee
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas Kolenikov
> Sent: Wednesday, September 26, 2012 9:44 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: sampling weight
>
> If Lynn obtained her sample in a rigorous way by enumerating the dwellings,
> she should have all the inputs into the probability of selection, and the
> baseline sampling weight is the inverse of that.
> Then she would want to correct for non-response, using the response rate,
> i.e. the fraction of those responding to the survey among those sampled.
>
> If Lynn is interested in a specific population (females of reproductive age,
> say), and that's who the survey collected the data on, then she would need
> to get the total population counts for that specific population (which may
> prove even more difficult).
>
> If she does not have these figures, then I don't really know what to do. As
> they say, when you approach a statistician with collected data in hand, they
> can only tell you what killed your study.
>
> --
> -- Stas Kolenikov, PhD, PStat (SSC)  ::  http://stas.kolenikov.name
> -- Senior Survey Statistician, Abt SRBI  ::  work email kolenikovs at srbi
> dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
>
>
>
> On Wed, Sep 26, 2012 at 8:15 AM, JVerkuilen (Gmail)
> <jvverkuilen@gmail.com> wrote:
>> On Wed, Sep 26, 2012 at 2:49 AM, Lynn Lee <lynn09v@gmail.com> wrote:
>>
>>> Any suggestions as to which weight is better? Or might other types of
>>> weights be better than population weights?
>>
>> Do you have a few accurately observed variables such as the population
>> age and gender breakdown? If so you can often create
>> post-stratification weights (through a process called "raking") that
>> make your samples align with the associations observed in those
>> tables.
>>
>> A quick -findit raking- turned up a program -ipfraking- written by
>> Stas Kolenikov and available from his website. Hopefully he'll chime
>> in.
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
```
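The weighting steps discussed in the thread (a base weight equal to the inverse of the selection probability, a nonresponse adjustment by the response rate, and post-stratification or raking to known population counts) can be sketched in a few lines. This is a minimal illustration in Python rather than Stata, and it is not the `ipfraking` program mentioned above. All function names, city labels, and numbers are hypothetical, chosen only to make the arithmetic visible.

```python
# Illustrative sketch: every name and number here is made up, not from the thread.

def design_weight(selection_prob, response_rate):
    """Base weight (inverse selection probability), inflated for nonresponse."""
    return (1.0 / selection_prob) / response_rate


def poststrat_weights(pop_counts, sample_counts):
    """Per-cell weight = known population count / number of sample observations."""
    return {cell: pop_counts[cell] / n for cell, n in sample_counts.items()}


def rake(sample_counts, margins, n_iter=100):
    """Raking (iterative proportional fitting) over cells keyed by tuples.

    sample_counts: {(level_0, level_1, ...): sample observations in the cell}
    margins: one {level: population total} dict per dimension
    Returns {cell: weight per observation in that cell}.
    """
    weights = {cell: 1.0 for cell in sample_counts}
    for _ in range(n_iter):
        for dim, targets in enumerate(margins):
            for level, target in targets.items():
                cells = [c for c in sample_counts if c[dim] == level]
                current = sum(weights[c] * sample_counts[c] for c in cells)
                for c in cells:
                    weights[c] *= target / current
    return weights


# Hypothetical cities: sample sizes 1000 and 400, populations 100,000 and 80,000.
sample = {"city_1": 1000, "city_n": 400}
population = {"city_1": 100_000, "city_n": 80_000}
print(poststrat_weights(population, sample))

# Hypothetical gender-by-age sample counts and population margins.
cells = {("F", "young"): 30, ("F", "old"): 20, ("M", "young"): 10, ("M", "old"): 40}
raked = rake(cells, [{"F": 500, "M": 500}, {"young": 450, "old": 550}])
```

With these made-up figures, each observation in `city_1` gets a weight of 100 and each observation in `city_n` a weight of 200. Using the sample counts themselves as weights (1000 and 400) would do the opposite and give more weight to the city that is already over-represented in the sample. That is why population counts, or the selection probabilities, are needed as inputs.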