Statalist



RE: st: Unexpected proportions after survey commands


From   luis <luis.ortiz@upf.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Unexpected proportions after survey commands
Date   Sat, 9 May 2009 22:13:58 +0000 (GMT)

Sent using Vodafone Real Mail.

-----Original Message-----

From: "Jean-Gael Collomb" <JG@ufl.edu>
Sent: Sat, 9 May 2009 17:13:57 -0400
To: statalist@hsphsun2.harvard.edu
Received:  9-May-2009 21:15:36 +0000
Subject: st: Unexpected proportions after survey commands

Hello all,

I have a question about using post-stratification weights with
Stata's survey commands. After setting the weights, I do not get the
proportions I expected.

My overall research question is to see if tourism (TOURIND) influences  
quality of life in several communities in a rural province of Namibia.  
My aim was to conduct individual interviews in a sample of 10% of all  
households in each community. I obtained household census counts from  
key informants within the community and my own double checks during  
field work. This random sample yielded 395
interviews, of which only 9 (2.3%) were conducted with individuals
working in tourism. Given this very low number of respondents who
worked in tourism and my interest in trying to understand the impact  
of tourism, I established a sampling frame restricted to individuals  
working in tourism and interviewed 72 individuals. [Two of those  
interviews were conducted with individuals not employed in tourism but  
living in a household where someone was]. In total, I thus interviewed  
467 people, of whom 79 worked in tourism. My full sample
oversampled tourism employees, and I think it would be wrong to infer
from it that 17% (79/467*100) of the population works in tourism. I
think post-stratification weights should be assigned to my data set to
correct for the oversampling. In fact, the percentage of the  
population working in tourism varies by communities and thus different  
weights should be calculated for different communities. I used  
existing reports documenting total numbers of community residents  
employed by local tourism operators and total population size as a  
basis to calculate the "true" distribution of tourism employees  
(weight2). The weights were calculated by dividing the “true”  
percentage by the “oversampled” percentage.
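
The weight calculation described above can be sketched in Stata as follows. This is an illustration only, under stated assumptions: pop_share is a hypothetical variable holding the "true" share of each conservancy-by-tourind cell taken from the external reports, and n_cons, n_cell, samp_share, and wt_check are names invented here. Defining the cells by tourind is also an assumption; the code further down assigns weights by the variable sample instead.

```stata
* Sketch only: pop_share is a hypothetical variable holding the assumed
* population share (a fraction between 0 and 1) of each
* conservancy-by-tourind cell.
bysort conservancy: generate n_cons = _N            // interviews per conservancy
bysort conservancy tourind: generate n_cell = _N    // interviews per cell
generate samp_share = n_cell / n_cons               // "oversampled" share
generate wt_check = pop_share / samp_share          // "true" / "oversampled"
```

Comparing wt_check with the hard-coded values of samplewt2 would show whether the two calculations agree cell by cell.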

The problem is that when I apply the weights in Stata, I do not get  
the proportion I expected. Specifically, I expected that after svyset  
_n [pweight = samplewt2] and svy: tab tourind, I would find that 0.84%  
of the population could be labeled TOURIND, but Stata returns a value  
of 3.25% (and similar discrepancies for each community).
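
One way to see what svy: tab reports is to rebuild the weighted percentage by hand, as the sum of the weights of TOURIND respondents divided by the sum of all weights. The lines below are a minimal sketch, assuming tourind is coded 0/1 with no missing values; the scalar names w_tour and w_all are made up for this example.

```stata
* Weighted share of tourind == 1, rebuilt by hand
* (assumes tourind is coded 0/1 and is never missing)
quietly summarize samplewt2 if tourind == 1
scalar w_tour = r(sum)                              // total weight, tourism
quietly summarize samplewt2
scalar w_all = r(sum)                               // total weight, everyone
display "weighted percent TOURIND = " 100 * w_tour / w_all
```

If each weight is a within-community ratio of percentages, the weights in a community sum to roughly its number of interviews, so the pooled figure weights communities by sample size rather than by population size. That may be one source of a gap between the pooled result and the expected value.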

I am not sure whether I am doing something wrong in calculating the weights,
assigning the weights to my dataset, or entering the tab commands in
svy mode. I'd greatly appreciate your help in moving past this
and taking advantage of the survey commands in Stata.

Thank you very much if you have time to give me some feedback or point  
me towards the best information source (textbook?).

Cheers,

Jean-Gael Collomb, jg@ufl.edu

(PS. I run Stata 10 on Mac OS X)



Stata code entered:

*ASSIGNING POST STRATIFICATION WEIGHTS
*-------------------------------------
gen samplewt2=0
label var samplewt2 "Post Stratification sample weight 2"
replace samplewt2=0.99975204562360500 if conservancy==1 & sample==1
replace samplewt2=0.04357333333333330 if conservancy==2 & sample==2
replace samplewt2=1.39197814207650000 if conservancy==2 & sample==1
replace samplewt2=0.10144078144078100 if conservancy==3 & sample==2
replace samplewt2=1.18320139407518000 if conservancy==3 & sample==1
replace samplewt2=0.05683908045977010 if conservancy==4 & sample==2
replace samplewt2=1.47985380116959000 if conservancy==4 & sample==1
replace samplewt2=0.01906976744186050 if conservancy==5 & sample==2
replace samplewt2=1.05030411449016000 if conservancy==5 & sample==1

tab tourind
bysort conservancy: tab tourind

*applying weight2 (those derived from IRDNC data)
svyset _n [pweight = samplewt2]
svy: tab tourind, percent
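
A few diagnostic commands could be run after the code above to narrow down where the discrepancy arises. This is a sketch rather than a fix. It relies only on variables already used above (samplewt2, conservancy, sample, tourind) and assumes svyset has been run as shown.

```stata
* Observations still at the initial weight of 0 matched none of the
* replace rules above
count if samplewt2 == 0
tabulate conservancy sample if samplewt2 == 0, missing

* Weighted percentages of tourind within each conservancy (row percentages)
svy: tabulate conservancy tourind, row percent
```

The first two commands show whether any interviews kept a weight of 0, and the last gives per-community weighted percentages to compare with the values expected for each community.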



Jean-Gael "JG" Collomb

PhD candidate

School of Natural Resources and Environment / School of Forest  
Resources and Conservation

University of Florida

jgcollomb@gmail.com

jg@ufl.edu

+1 (352) 870 6696





  
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




