
Re: st: comparing 2 surveys; testing means and distributions; using weights

From   Steve Samuels
Subject   Re: st: comparing 2 surveys; testing means and distributions; using weights
Date   Thu, 28 Jun 2012 14:22:03 -0400

To be sure of what to advise you, I would want to see a description of the sample
frame, design, and the sampling plan for each survey. I don't understand
your statement that sampling weights were "constructed based on demographic
targets". I would guess you mean that post-survey adjustments were applied to
the original sampling weights.

Assuming that you have accurately described the situation:

* You can test differences between groups. You do not need to -svyset- the data,
but doing so will save typing [pw = ] for many commands; if you do -svyset-, then
designate the surveys as strata.

* To compare statistics other than totals, it does not matter that the weights
are scaled differently. If you wish to compare totals, you will have to rescale
the "sum to sample size" weights so that they sum to the weight total of the
other survey.

* To compare distributions, I suggest that you categorize continuous variables
and use the svy variants of -tab- and -prop-, or of -mlogit- or -ologit-. -mgof-
(by Ben Jann, from SSC) does not compare two distributions but instead tests the
fit of a single sample to a distribution specified by equation. You can also graph
distribution functions using -cumul- with -aweights-.
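
As a rough illustration of the three points above, here is a minimal sketch in Stata. It assumes the two files have already been appended into one dataset, and every variable name in it (survey, wt, y, ycat, wt_rs, cdf1, cdf2) is hypothetical. The quartile cutpoints are an arbitrary choice for the example, not a recommendation.

```stata
* Sketch only. Hypothetical names: survey (coded 1 or 2), wt (supplied
* weight), y (a continuous outcome). Assumes the data are already combined.

* Declare the design: no clusters, the two surveys as strata
svyset _n [pweight = wt], strata(survey)

* Weighted means by survey; differently scaled weights do not affect these
svy: mean y, over(survey)

* Test the difference in means: the coefficient on 2.survey is
* mean(survey 2) - mean(survey 1)
svy: regress y i.survey

* Only needed for comparing totals: rescale the survey 1 weights so that
* they sum to the weight total of survey 2
summarize wt if survey == 2, meanonly
scalar tot2 = r(sum)
summarize wt if survey == 1, meanonly
scalar tot1 = r(sum)
generate double wt_rs = wt
replace wt_rs = wt * tot2 / tot1 if survey == 1

* Compare distributions: categorize y (quartiles of the pooled, unweighted
* data, chosen arbitrarily), then use design-based tests
xtile ycat = y, nquantiles(4)
svy: tabulate ycat survey, column
svy: ologit ycat i.survey

* Graph the weighted empirical distribution functions, one per survey
cumul y [aweight = wt] if survey == 1, generate(cdf1)
cumul y [aweight = wt] if survey == 2, generate(cdf2)
twoway (line cdf1 y if survey == 1, sort) ///
       (line cdf2 y if survey == 2, sort)
```

If totals are to be compared, the design would have to be declared again with wt_rs in place of wt before running the svy commands.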

I notice that you are at Laval University. If you have questions about the
technical survey documents, I suggest that you consult one of the excellent
statisticians there.


On Jun 27, 2012, at 8:58 PM, Marie-Hélène Felt wrote:

Hello all,
I am working on comparing 2 independent survey datasets that stem from relatively similar questionnaires (done in the same year, in Canada, with national representativeness targeted). I have some doubts about how to proceed using Stata. 
Sampling weights are supplied in both datasets. They are constructed based on demographic targets such as age, income or city size (these targets variables differ across surveys). I don't think there is any cluster or strata. 

My questions:
1) I wasn't sure whether I was allowed to combine both datasets into a single one, but from what I have read in the Statalist archive I can do just that and create 2 strata indicating the 2 surveys? Will Stata understand that there are 2 surveys, or will it mess up the weights? 

2) the weights, as provided, are not scaled the same way. For one dataset, the mean is one and so the represented population size equals the sample size (around 5000). For the other dataset, the weights are such that the population size is huge (around 20,000,000). Is that an issue? Should I rescale these last weights? 

3) My goal is to compare answers/variables of the 2 surveys/datasets. I want to first test whether differences in means or proportions across surveys are significant. Once I have combined both datasets, can I just do t tests between groups [groups that are also strata]? 

4) I would also like to test equality of distributions, not only means. I know the mgof command for categorical variables. Is there a way to test continuous distributions taking weights/the survey dimension into account?

thanks a lot in advance for your tips!

