
Re: st: Unexpected proportions after survey commands

From	"Michael I. Lichter" <[email protected]>
To	[email protected]
Subject	Re: st: Unexpected proportions after survey commands
Date	Sat, 09 May 2009 18:26:52 -0400

Jean-Gael,

First of all, you can use Stata's svy poststratification weights (as opposed to probability weights) by creating a variable that has cell totals for each poststratification cell. E.g.,


TOURIND  Community  #
Yes      1          10
No       1          80
Yes      2          5
No       2          92

etc. You then use the -svyset- syntax for poststratification strata and weights.
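A minimal sketch of that -svyset- call, assuming two hypothetical variables that are not in the original post: basewt (the design weight for each respondent) and celltotal (the known population count for the respondent's community-by-TOURIND cell, as in the table above). The variable names conservancy and tourind are taken from the code quoted below.

```stata
* Hypothetical variables: basewt = design weight,
* celltotal = population count for the respondent's poststratification cell.

* One poststratum identifier per conservancy-by-tourind cell
egen pscell = group(conservancy tourind)

* Declare the design, with poststrata and their population totals
svyset _n [pweight = basewt], poststrata(pscell) postweight(celltotal)

* Weighted proportion working in tourism
svy: tabulate tourind, percent
```

With this setup Stata adjusts the weights within each cell so that they sum to celltotal, so the cell totals only need to be entered once per cell.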

Alternatively, what you said about "true" vs. "sample" proportions sounds fine, but you might not have done it correctly in practice. You need to assign weights to both the tourism and non-tourism parts of the sample. Also, you shouldn't use falsely high levels of precision--stick with 2-4 digits.
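As a rough illustration of assigning weights to both parts of the sample, the sketch below first checks for observations that were never given a weight and then builds one weight per conservancy-by-tourind cell. It assumes a hypothetical variable popcount holding the known population total for each cell; samplewt2, conservancy, sample, and tourind are the names used in the code quoted below.

```stata
* Check: any observation still at the initial weight of 0 was never assigned one
count if samplewt2 == 0
bysort conservancy sample: summarize samplewt2

* Hypothetical: popcount = known population total for each
* conservancy-by-tourind cell (same value on every row of the cell).
* Within each by-group, _N is the number of sampled cases in that cell,
* so cellwt is the number of population members each respondent represents.
bysort conservancy tourind: generate double cellwt = popcount / _N

svyset _n [pweight = cellwt]
svy: tabulate tourind, percent
```

Because the cells here are defined by tourind rather than by which sample a respondent came from, tourism workers picked up in the random sample and those from the tourism-only frame receive the same weight within a community.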


Michael

Jean-Gael Collomb wrote:

Hello all,
I have a question about using post stratification weights and using Stata's survey commands. After setting the weights, I do not get the proportions I expected.
My overall research question is to see if tourism (TOURIND) influences quality of life in several communities in a rural province of Namibia. My aim was to conduct individual interviews in a sample of 10% of all households in each community. I obtained household census counts from key informants within the community and my own double checks during field work. This yielded a random sample of 395 interviews, of which only 9 (2.3%) were conducted with individuals working in tourism. Given this very low number of respondents who worked in tourism and my interest in trying to understand the impact of tourism, I established a sampling frame restricted to individuals working in tourism and interviewed 72 individuals. [Two of those interviews were conducted with individuals not employed in tourism but living in a household where someone was]. In total, I thus interviewed 467 people, among which 79 worked in tourism. My full sample oversampled tourism employees and I think it would be wrong to derive from it that 17% (79/467*100) of the population works in tourism. I think post stratification weights should be assigned to my data set to correct for the oversampling. In fact, the percentage of the population working in tourism varies by community and thus different weights should be calculated for different communities. I used existing reports documenting total numbers of community residents employed by local tourism operators and total population size as a basis to calculate the "true" distribution of tourism employees (weight2). The weights were calculated by dividing the “true” percentage by the “oversampled” percentage.
The problem is that when I apply the weights in Stata, I do not get the proportion I expected. Specifically, I expected that after svyset _n [pweight = samplewt2] and svy: tab tourind, I would find that 0.84% of the population could be labeled TOURIND, but Stata returns a value of 3.25% (and similar discrepancies for each community).
I am not sure whether I am doing something wrong in calculating the weights, assigning the weights to my dataset, or entering the tab commands in svy mode. I’d greatly appreciate your help in moving past this and taking advantage of survey commands in Stata.
Thank you very much if you have time to give me some feedback or point me towards the best information source (textbook?).
Cheers,

Jean-Gael Collomb, [email protected]

(PS. I run Stata 10 on Mac OS X)



Stata code entered:

*ASSIGNING POST STRATIFICATION WEIGHTS
*-------------------------------------
gen samplewt2=0
label var samplewt2 "Post Stratification sample weight 2"
replace samplewt2=0.99975204562360500 if conservancy==1 & sample==1
replace samplewt2=0.04357333333333330 if conservancy==2 & sample==2
replace samplewt2=1.39197814207650000 if conservancy==2 & sample==1
replace samplewt2=0.10144078144078100 if conservancy==3 & sample==2
replace samplewt2=1.18320139407518000 if conservancy==3 & sample==1
replace samplewt2=0.05683908045977010 if conservancy==4 & sample==2
replace samplewt2=1.47985380116959000 if conservancy==4 & sample==1
replace samplewt2=0.01906976744186050 if conservancy==5 & sample==2
replace samplewt2=1.05030411449016000 if conservancy==5 & sample==1

tab tourind
bysort conservancy: tab tourind

*applying weight2 (those derived from IRDNC data)
svyset _n [pweight = samplewt2]
svy: tab tourind, percent



Jean-Gael "JG" Collomb

PhD candidate
School of Natural Resources and Environment / School of Forest Resources and Conservation
University of Florida

[email protected]

[email protected]

+1 (352) 870 6696
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


--
Michael I. Lichter, Ph.D. <[email protected]>
Research Assistant Professor & NRSA Fellow
UB Department of Family Medicine / Primary Care Research Institute
UB Clinical Center, 462 Grider Street, Buffalo, NY 14215
Office: CC 125 / Phone: 716-898-4751 / FAX: 716-898-3536


References:
- st: Unexpected proportions after survey commands
  - From: Jean-Gael Collomb <[email protected]>
