
From: sjsamuels@gmail.com
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Unexpected proportions after survey commands
Date: Sat, 9 May 2009 22:03:11 -0400

--- I meant: "A probability weight is the number of people represented
by a sample member."

On Sat, May 9, 2009 at 8:20 PM, <sjsamuels@gmail.com> wrote:
> Jean-Gael:
>
> A probability weight is the number of people represented by those in a
> sample member. Your weights look nothing like numbers of people. In
> your first sample, the HH probability weights (before non-response
> adjustments) should be 10.0, because you took a 10% sample of HH. If
> you interviewed every adult in the HH, they retain the HH weight. If
> you interviewed 1/K in a household, the person weight is the HH weight
> x K.
>
> It's not clear whether your frame of tourist workers (sample 2) was
> of HH or people. If people, then you should be interviewing only
> people who work in tourism, not their HH members--as HH members would
> not have been in the frame. Since I don't know your sampling scheme,
> I don't know how to compute the sampling weight.
>
> When you have 2 samples, as you did here, treat each one as coming
> from a different stratum. Transfer the people in sample 1 who work in
> tourism to the 2nd stratum, and retain their original sampling weight.
>
> If villages are strata, then you have 2x10 = 20 sampling strata.
> However it sounds like 10 villages are themselves a convenience
> sample. If so, then keep the two samples as strata. Your PSU should
> probably be HH. However if you interviewed only one person per HH,
> then PSU can be person.
>
> After computing the sampling weights, you can, as Michael states, use
> the -poststrata()- and -postweight()- options of -svyset- in Stata to
> reproduce the tourism counts.
> Your post-stratification totals (tourism workers, non-tourism workers)
> should add to the estimated population totals in the 10 villages;
> 0.84% should be tourism workers, and 99.16% should be non-tourism
> workers. If you want separate estimates of impact in each village,
> then you can use the villages to also define your post-strata: 10
> villages x 2 tourist-worker-status strata.
>
> Finally, unless one goal is to compare tourism and non-tourism
> workers, it was not necessary to enhance your sample with tourism
> workers. Tourism workers are obviously greatly affected by tourism,
> compared to non-tourism workers. However, they constitute only 0.84%
> of the population, so contribute minimally to the overall effects of
> tourism on the population.
>
> If you need further assistance, the University of Florida has a number
> of faculty with experience in survey sampling.
>
> -Steve
>
> On Sat, May 9, 2009 at 5:13 PM, Jean-Gael Collomb <JG@ufl.edu> wrote:
>> Hello all,
>>
>> I have a question about using post stratification weights and using Stata's
>> survey commands. After setting the weights, I do not get the proportions I
>> expected.
>>
>> My overall research question is to see if tourism (TOURIND) influences
>> quality of life in several communities in a rural province of Namibia. My
>> aim was to conduct individual interviews in a sample of 10% of all
>> households in each community. I obtained household census counts from key
>> informants within the community and my own double checks during field work.
>> This random sample yielded 395 interviews, of which only
>> 9 (2.3%) were conducted with individuals working in tourism. Given this very
>> low number of respondents who worked in tourism and my interest in trying to
>> understand the impact of tourism, I established a sampling frame restricted
>> to individuals working in tourism and interviewed 72 individuals. [Two of
>> those interviews were conducted with individuals not employed in tourism but
>> living in a household where someone was]. In total, I thus interviewed 467
>> people, among which 79 worked in tourism. My full sample oversampled tourism
>> employees and I think it would be wrong to derive from it that 17%
>> (79/467*100) of the population works in tourism. I think post stratification
>> weights should be assigned to my data set to correct for the oversampling.
>> In fact, the percentage of the population working in tourism varies by
>> community and thus different weights should be calculated for different
>> communities. I used existing reports documenting total numbers of community
>> residents employed by local tourism operators and total population size as a
>> basis to calculate the "true" distribution of tourism employees (weight2).
>> The weights were calculated by dividing the “true” percentage by the
>> “oversampled” percentage.
>>
>> The problem is that when I apply the weights in Stata, I do not get the
>> proportion I expected. Specifically, I expected that after svyset _n
>> [pweight = samplewt2] and svy: tab tourind, I would find that 0.84% of the
>> population could be labeled TOURIND, but Stata returns a value of 3.25% (and
>> similar discrepancies for each community).
>>
>> I am not sure whether I am doing something wrong in calculating the weights,
>> assigning the weights to my dataset, or entering the tab commands in svy
>> mode. I’d greatly appreciate your help in moving past this and taking
>> advantage of survey commands in Stata.
>>
>> Thank you very much if you have time to give me some feedback or point me
>> towards the best information source (textbook?).
>>
>> Cheers,
>>
>> Jean-Gael Collomb, jg@ufl.edu
>>
>> (PS. I run Stata 10 on Mac OS X)
>>
>> Stata code entered:
>>
>> *ASSIGNING POST STRATIFICATION WEIGHTS
>> *-------------------------------------
>> gen samplewt2=0
>> label var samplewt2 "Post Stratification sample weight 2"
>> replace samplewt2=0.99975204562360500 if conservancy==1 & sample==1
>> replace samplewt2=0.04357333333333330 if conservancy==2 & sample==2
>> replace samplewt2=1.39197814207650000 if conservancy==2 & sample==1
>> replace samplewt2=0.10144078144078100 if conservancy==3 & sample==2
>> replace samplewt2=1.18320139407518000 if conservancy==3 & sample==1
>> replace samplewt2=0.05683908045977010 if conservancy==4 & sample==2
>> replace samplewt2=1.47985380116959000 if conservancy==4 & sample==1
>> replace samplewt2=0.01906976744186050 if conservancy==5 & sample==2
>> replace samplewt2=1.05030411449016000 if conservancy==5 & sample==1
>> tab tourind
>> bysort conservancy: tab tourind
>> *applying weight2 (those derived from IRDNC data)
>> svyset _n [pweight = samplewt2]
>> svy: tab tourind, percent
>>
>> Jean-Gael "JG" Collomb
>> PhD candidate
>> School of Natural Resources and Environment / School of Forest Resources and
>> Conservation
>> University of Florida
>> jgcollomb@gmail.com
>> jg@ufl.edu
>> +1 (352) 870 6696
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
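
Editor's sketch of the setup Steve describes (two samples as strata, design weights, then post-stratification on tourism status). This is a toy illustration under stated assumptions, not the poster's actual design: the data are simulated to match the counts in the post (395 + 72 interviews, 79 tourism workers), the sample-2 weight of 1 is a placeholder because the thread never establishes that sampling scheme, and the population totals (84 and 9916 out of a hypothetical 10,000) are invented only so that the tourism share is 0.84%. The variable names wt, stratum, and poptotal are hypothetical.

```stata
* Toy data matching the counts in the post (assumption: one record per person)
clear
set obs 467
* sample 1 = 10% household sample (395), sample 2 = tourism frame (72)
gen byte sample = cond(_n <= 395, 1, 2)
* 9 tourism workers in sample 1, 70 in sample 2 (2 sample-2 respondents are not)
gen byte tourind = (_n <= 9) | (_n > 395 & _n <= 465)

* Design weights: 10 for a 10% sample; the sample-2 value is a placeholder
gen double wt = cond(sample == 1, 10, 1)

* Strata: the two samples, with sample-1 tourism workers moved to stratum 2
gen byte stratum = sample
replace stratum = 2 if sample == 1 & tourind == 1

* Hypothetical post-stratum population totals (0.84% of 10,000 = 84)
gen double poptotal = cond(tourind == 1, 84, 9916)

* Post-stratify on tourism status, then tabulate
svyset _n [pweight = wt], strata(stratum) poststrata(tourind) postweight(poptotal)
svy: tabulate tourind, percent
```

Post-stratification rescales the weights so that they sum to poptotal within each post-stratum, so the estimated tourism share in this sketch is 84/10000 by construction. With real data, the design weights and post-stratum totals would come from the actual sampling scheme and the community reports, and per-community post-strata would need one total for each community-by-tourism cell.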

References:
  st: Unexpected proportions after survey commands
    From: Jean-Gael Collomb <JG@ufl.edu>
  Re: st: Unexpected proportions after survey commands
    From: sjsamuels@gmail.com

