Re: st: svyset command, pweight value?

From   Austin Nichols <>
To   "" <>
Subject   Re: st: svyset command, pweight value?
Date   Thu, 24 Oct 2013 09:59:23 -0400

Mikkel Høiberg <>:
The simplest pweight is 1/p, where p is the probability of selection.
So if the 1250 surveys were a simple random sample of the 360144
people, and all the 1250 surveys were returned you would define
g s1w=360144/1250
for "stage 1 weight" noting in passing that this is not called a dummy
variable. But you have nonresponse. If the 635 responses were a simple
random sample of the 1250 you could define
g pw=360144/1250*1250/635
or, equivalently,
g pw=360144/635
but I doubt either "stage" is a simple random sample. Nonresponse, in
particular, is unlikely to be random.  You will want to model the
probability of nonresponse as a function of observable
characteristics, including age and geography and whatever else you
can, then adjust appropriately.  Read a book on survey methodology,
for starters.
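
A minimal sketch of that two-step adjustment, under loudly stated
assumptions: the data hold one row per sampled person (respondents and
nonrespondents alike), and the variable names female, agegrp, region,
responded, and y are hypothetical, not taken from the original question.
The logit specification is only one possible response model.

```stata
* Sketch only. Hypothetical variables, one row per sampled person:
*   female, agegrp (age decade), region, responded (1 = returned), y (outcome)

* Stage 1: inverse sampling fraction within each stratum.
generate double s1w = 360144/1250 if female == 1 & agegrp == 4
* ... fill in s1w for each remaining stratum from its own counts ...

* Stage 2: model the probability of response from observables.
logit responded i.agegrp i.female i.region
predict double phat, pr

* Final weight for respondents: stage 1 weight divided by the
* estimated response probability.
generate double pw = s1w/phat if responded == 1

* Declare the design, then estimate with design-based standard errors.
egen stratum = group(female agegrp)
svyset [pweight = pw], strata(stratum)
svy: mean y
```

If response really were a simple random draw within a stratum, phat would
be close to 635/1250 there and pw would reduce to 360144/635.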

On Thu, Oct 24, 2013 at 9:44 AM, Mikkel Høiberg <> wrote:
> Dear Stata listers,
> I am working on survey data, where recipients of the questionnaire were
> drawn from the total Norwegian population, the total number of which
> is known.
> The number of questionnaires sent out was stratified by age-decade to
> ensure higher total number of responses in the youngest and eldest
> part of the population and to correct for expected lower response
> rates in these subgroups.
> As far as I understand, I am to use the svyset command for survey data
> to prevent falsely low standard deviations and thus false positive
> statistical associations.
> I intend to create a dummy variable for the svyset command.
> However: how should this dummy variable be constructed?
> As an example: for women between 40 and 50 years, the total female
> population in question is 360.114. 1250 questionnaires were sent out,
> 635 received.
> Could I code a dummy variable with the value 360144* 635/1250 =
> 182.935 for this agegroup?
> And then do the same calculations for other subgroups?
> Or should I use the inverse value?
> Or a third option?
> --
> Your help is very much appreciated!
