



Re: st: comparing 2 surveys; testing means and distributions; using weights


From   Steve Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: comparing 2 surveys; testing means and distributions; using weights
Date   Thu, 28 Jun 2012 14:22:03 -0400


To be sure of what to advise you, I would want to see a description of the sample
frame, design, and the sampling plan for each survey. I don't understand
your statement that sampling weights were "constructed based on demographic
targets". I would guess you mean that post-survey adjustments were applied to
the original sampling weights.

Assuming that you have accurately described the situation:

* You can test differences between groups. You do not need to -svyset- the data,
but doing so will save you from typing [pw = ] in many commands; if you do
-svyset-, then designate the surveys as strata.

* To compare statistics other than totals, it does not matter that the weights
are scaled differently. If you wish to compare totals, you will have to rescale
the "sum to sample size" weights to the same weight total of the other survey.

* To compare distributions, I suggest that you categorize continuous variables
and use svy variants of -tab- and -mlogit- or -ologit- and -prop-. -mgof- (by Ben Jann,
from SSC) does not compare two distributions but instead tests the fit of a
single sample to a distribution specified by equation. You can also graph
distribution functions using -cumul- with -aweights-.
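
The three points above can be sketched in Stata as follows. This is only a
minimal illustration under stated assumptions: the data are simulated, and the
names survey, wt, y, ycat, wt_tot, cdf1 and cdf2 are hypothetical stand-ins for
the variables in your combined dataset. Survey 1 plays the role of the survey
with "sum to sample size" weights and survey 2 the one with population-scale
weights.

```stata
* Toy data standing in for the two appended surveys (all names hypothetical)
clear
set seed 12345
set obs 1000
generate byte survey = cond(_n <= 500, 1, 2)
generate double y = rnormal(50, 10) + 2*(survey == 2)
* survey 1: weights with mean near 1; survey 2: population-scale weights
generate double wt = cond(survey == 1, 0.5 + runiform(), 2000 + 4000*runiform())

* Declare the weights once and designate the surveys as strata
svyset [pw = wt], strata(survey)

* Difference in means: the coefficient on 2.survey is the survey 2 minus
* survey 1 difference, with a design-based test
svy: regress y i.survey

* Compare distributions: categorize the continuous variable, then use the
* svy variants of -tab- and -ologit-
xtile ycat = y, nq(4)
svy: tabulate ycat survey, column
svy: ologit ycat i.survey

* Totals only: rescale the survey 1 weights to the weight total of survey 2
quietly summarize wt if survey == 1
scalar tot1 = r(sum)
quietly summarize wt if survey == 2
scalar tot2 = r(sum)
generate double wt_tot = wt
replace wt_tot = wt * (tot2 / tot1) if survey == 1
svyset [pw = wt_tot], strata(survey)
svy: total y, over(survey)

* Weighted empirical distribution functions, one per survey
cumul y if survey == 1 [aw = wt], generate(cdf1)
cumul y if survey == 2 [aw = wt], generate(cdf2)
line cdf1 cdf2 y, sort
```

Means, proportions and the -cumul- curves are unchanged by the rescaling,
because multiplying all weights within a survey by a constant cancels out of
any weighted ratio; only the totals depend on it.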

I notice that you are at Laval University. If you have questions about the
technical survey documents, I suggest that you consult one of the excellent
statisticians there.

Steve
sjsamuels@gmail.com






On Jun 27, 2012, at 8:58 PM, Marie-Hélène Felt wrote:

Hello all,
I am working on comparing 2 independent survey datasets that stem from relatively similar questionnaires (conducted in the same year, in Canada, with national representativeness targeted). I have some doubts about how to proceed using Stata.
Sampling weights are supplied in both datasets. They are constructed based on demographic targets such as age, income or city size (these target variables differ across surveys). I don't think there is any clustering or stratification.

My questions:
1) I wasn't sure whether I was allowed to combine both datasets into a single one, but from what I have read in the Statalist archive I can do just that and create 2 strata indicating the 2 surveys? Will Stata understand that there are 2 surveys, and won't it mess up the weights?

2) the weights, as provided, are not scaled the same way. For one dataset, the mean is one and so the represented population size equals the sample size (around 5000). For the other dataset, the weights are such that the population size is huge (around 20,000,000). Is that an issue? Should I rescale these last weights? 

3) My goal is to compare answers/variables of the 2 surveys/datasets. I want to first test whether differences in means or proportions across surveys are significant. Once I have combined both datasets, can I just do t tests between groups [groups that are also strata]?

4) I would also like to test equality of distributions, not only means. I know the mgof command for categorical variables. Is there a way to test continuous distributions taking weights/the survey dimension into account?

thanks a lot in advance for your tips!

MHF


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



