

From: Stas Kolenikov <skolenik@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: sampling weight

Date: Wed, 26 Sep 2012 22:41:16 -0500

These are steps in the right direction. Please describe your sampling design in full detail, so that we could brainstorm and see what the right specifications should be.

--
-- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
-- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at srbi dot com
-- Opinions stated in this email are mine only, and do not reflect the position of my employer

On Wed, Sep 26, 2012 at 9:58 PM, Lynn Lee <lynn09v@gmail.com> wrote:

> Dear Stas,
>
> I just want to do simple sampling.
>
> Take "webuse total" for example. I am wondering how was "swgt" generated? I guess: obs 1 has her corresponding sampling weight, swgt=25964, which is the total population in her group; obs 4 has his corresponding sampling weight, swgt=4312, which is the total population in his group; etc. Is that right?
>
> So, if I use this logic in my downloaded survey data sets, I can group all the obs into different sampling weight over residence place and gender. Like: I calculate total number of individuals who were in the dataset according to their resident city, say, total number of individuals in city 1 is 1000 in dataset, total number of individuals in city n is 400 in the data set, then, I generate this city-total-individuals as a new variable (weight). (Or I can even be more detailed, total number of people in the data set over city, gender, age.) In regression, I simply use command "reg y x1 x2 x3 [pweight=total]". Can this way correct in part for unweighted data set?
>
> Suppose the mean of total(weights) is 500, min is 100 and max is 800. Then, weighted analysis will give at most 800/100 times the weights to potentially under-sampled observations. Do I understand correctly?
>
> I appreciate for your suggestion in advance.
>
> Best Regards,
> Lynn Lee
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Stas Kolenikov
> Sent: Wednesday, September 26, 2012 9:44 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: sampling weight
>
> If Lynn obtained her sample in a rigorous way by enumerating the dwellings, she should have all the inputs into the probability of selection, and the baseline sampling weight is the inverse of that. Then she would want to correct for non-response, which would be the fraction of those responding to the survey among those sampled.
>
> If Lynn is interested in a specific population (females of reproductive age, say), and that's who the survey collected the data on, then she would need to get the total population counts for that specific population (which may prove even more difficult).
>
> If she does not have these figures, then I don't really know what to do. As they say, when you approach a statistician with collected data in hand, they can only tell you what killed your study.
>
> On Wed, Sep 26, 2012 at 8:15 AM, JVerkuilen (Gmail) <jvverkuilen@gmail.com> wrote:
>
>> On Wed, Sep 26, 2012 at 2:49 AM, Lynn Lee <lynn09v@gmail.com> wrote:
>>
>>> Any suggestion to suggest which weight is better? Or, other types of weights may be better than population weights?
>>
>> Do you have a few accurately observed variables such as the population age and gender breakdown? If so you can often create post-stratification weights (through a process called "raking") that make your samples align with the associations observed in those tables.
>>
>> A quick -findit raking- turned up a program -ipfraking- written by Stas Kolenikov and available from his website. Hopefully he'll chime in.

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
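The weighting steps mentioned in this thread (a baseline weight equal to the inverse of the selection probability, a correction for non-response by the response rate, and raking to known population margins) can be illustrated with a little arithmetic. The sketch below is in Python rather than Stata, purely to show the calculations. The function names `base_weight`, `nonresponse_adjust`, and `rake`, and every number used, are hypothetical and do not come from the thread. The raking loop is a generic iterative proportional fitting routine, not the implementation in -ipfraking-.

```python
from collections import defaultdict


def base_weight(selection_prob):
    """Baseline sampling weight: the inverse of the selection probability."""
    if not 0.0 < selection_prob <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_prob


def nonresponse_adjust(weight, n_sampled, n_responded):
    """Divide the weight by the response rate (respondents / sampled units)."""
    if n_responded <= 0 or n_responded > n_sampled:
        raise ValueError("need 0 < n_responded <= n_sampled")
    return weight * n_sampled / n_responded


def rake(weights, groups, targets, max_iter=100, tol=1e-10):
    """Rake weights so weighted totals match known population margins.

    groups:  {margin name: list of category labels, one per observation}
    targets: {margin name: {category: population total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for margin, labels in groups.items():
            # Current weighted total in each category of this margin.
            totals = defaultdict(float)
            for wi, g in zip(w, labels):
                totals[g] += wi
            # Scale each observation so its category hits the target total.
            for i, g in enumerate(labels):
                factor = targets[margin][g] / totals[g]
                w[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins already match
            break
    return w


# Hypothetical example: a unit selected with probability 0.02,
# in a cell where 160 of 200 sampled units responded.
w0 = base_weight(0.02)
w1 = nonresponse_adjust(w0, n_sampled=200, n_responded=160)
print(w0, w1)

# Hypothetical margins: four respondents raked to city and gender totals.
raked = rake(
    [1.0, 1.0, 1.0, 1.0],
    {"city": ["A", "A", "B", "B"], "gender": ["F", "M", "F", "M"]},
    {"city": {"A": 600.0, "B": 400.0}, "gender": {"F": 550.0, "M": 450.0}},
)
print(raked)
```

A final weight computed along these lines would then be supplied to the analysis as a probability weight, in the same `[pweight=...]` position used in the regression command quoted above. Whether these particular adjustments suit a given survey depends on its sampling design.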

**Follow-Ups**:

- **st: sampling weight**, *From:* "Lynn Lee" <lynn09v@gmail.com>

**References**:

- **st: sampling weight**, *From:* "Lynn Lee" <lynn09v@gmail.com>
- **Re: st: sampling weight**, *From:* "JVerkuilen (Gmail)" <jvverkuilen@gmail.com>
- **Re: st: sampling weight**, *From:* Stas Kolenikov <skolenik@gmail.com>
- **st: sampling weight**, *From:* "Lynn Lee" <lynn09v@gmail.com>
