Re: st: Trend analysis - independent surveys

From   Steven Samuels <>
Subject   Re: st: Trend analysis - independent surveys
Date   Wed, 30 Jul 2008 11:34:55 -0500

Ángel, as sampling was independent in the two years, you need not
identify the PSUs that appeared in both. Do not merge PSUs from the
two years; in fact, you cannot, because a PSU may appear in only one
stratum. You imply that stratum names do not overlap between the two
surveys. If some names do overlap, then recode so that they do not.
For example, if you have a stratum numbered 5 in both waves, change
the 2001 version to 105 and the 2007 version to 705. (Feel free to
choose your own renumbering method.) You will not need to change the
PSU names, because Stata will treat them as unique if they appear in
different strata. Now -append- the two data sets and -svyset- with
the stratum variable, PSU variable, weights, and fpc.
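
The steps above could be sketched roughly as follows. This is only an
illustration: the file names (wave2001, wave2007) and variable names
(stratum, psu, wt, fpc, y) are hypothetical placeholders, stratum is
assumed to be numeric, and fpc is assumed to hold either the number of
PSUs in the stratum population or the first-stage sampling rate.

```stata
* Sketch only: all file and variable names are hypothetical.
use wave2001, clear
generate int year = 2001
replace stratum = 100 + stratum        // e.g. stratum 5 becomes 105
save wave2001_recoded, replace

use wave2007, clear
generate int year = 2007
replace stratum = 700 + stratum        // e.g. stratum 5 becomes 705

* Stack the two waves; PSU codes are left unchanged
append using wave2001_recoded

* Declare the combined design: PSUs within the recoded strata
svyset psu [pweight = wt], strata(stratum) fpc(fpc)

* Means by wave, then the 2007 - 2001 difference with a design-based SE
svy: mean y, over(year)
generate byte y2007 = (year == 2007)
svy: regress y y2007
```

The coefficient on y2007 in the last command estimates the change in the
mean of y between the two waves. I use a regression on an indicator
rather than -lincom- after -svy: mean- only because the coefficient
names that -lincom- expects differ across Stata versions.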

(You might also find it of interest to do a separate paired analysis  
of the PSUs that had observations in both waves. If you see, for  
example, a certain change in means or percentages between 2001 and  
2007 in the entire population, you might want to examine this in the  
paired PSUs.  For this analysis you would need to ignore stratum,  
ignore the original target population weights, and recompute weights  
so they reflect sampling probabilities and post-stratification totals  
for each year within the PSU. I would also treat these PSUs as  
fixed, so that standard errors are based on within-PSU variation.)
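
One possible reading of this paired analysis, as a rough sketch: it
assumes a data set in which both waves have already been appended, with
hypothetical variables psu, year, and y, plus a weight w_within that
you have recomputed yourself to reflect within-PSU sampling
probabilities and post-stratification. Declaring each PSU-by-year
combination as a stratum is my way of making the standard errors depend
only on within-PSU variation; other set-ups are possible.

```stata
* Sketch only: psu, year, y, and w_within are hypothetical names,
* and w_within must be computed by the analyst beforehand.

* Flag PSUs that have observations in both waves, and keep only those
bysort psu (year): generate byte inboth = (year[1] != year[_N])
keep if inboth

* Treat PSUs as fixed: each PSU-by-year cell is a stratum and
* individuals are the sampling units
egen long psuyear = group(psu year)
svyset _n [pweight = w_within], strata(psuyear)

* Change in the mean of y between waves among the paired PSUs
generate byte y2007 = (year == 2007)
svy: regress y y2007
```

Any PSU-by-year cell with a single observation will leave the standard
errors missing, so such cells would need to be combined or dropped
first.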


On Jul 30, 2008, at 2:41 AM, Ángel Rodríguez Laso wrote:

> I have a related question so I take advantage of the thread.
> I want to compare answers to identical questions (in more or less the
> same positions in the questionnaire) in two waves (2001 and 2007) of a
> survey (not panel) with the same target population.
> In both waves, a multistage selection procedure was followed: In the
> first stage, PSUs were selected at random after stratification, but
> strata were defined differently in each wave (seven geographical areas
> in the first wave and eleven health areas in the second, not
> overlapping). PSUs in both waves were defined identically and many of
> them appear in both samples. In the second stage, individuals were
> selected at random within PSUs. A finite population correction had to
> be used because around one third of the PSUs in each stratum were
> sampled. Sample weights were used in both waves because of differing
> selection probabilities and poststratification adjustments.
> If I merge both waves, would it be correct to merge also strata and
> PSU variables from both waves? Notice that PSUs names are identical
> and some individuals from both waves will belong to the same PSU, but
> the same PSU will belong to a different stratum in each wave, because
> strata names differ in both waves.
> Many thanks.
> Ángel
> 2008/7/25, Stas Kolenikov <>:
>> If the rounds are independent, then
>> t=(average[round t]-average[round s])/sqrt(variance_t [average in
>> round t]+variance_s [average in round s])
>> is normal / t with sum of design degrees of freedom / t with
>> Satterthwaite corrected degrees of freedom, depending on how you want
>> to think about them. The quick and dirty solution is to save the four
>> above quantities as locals or scalars and form this t-statistic. If
>> you had strata and cluster IDs with some sort of insider access, you
>> could put those together with
>> use dataset1
>> append using dataset2
>> *********
>> * make sure PSU labels are different in two years
>> *********
>> svy: mean whatever, over(year)
>> lincom [whatever]year2 - [whatever]year1
>> With separate bootstrap weights, that would not necessarily work, I am
>> afraid. If you have the same number of replicate weights in both
>> periods, it might.
>> On Fri, Jul 25, 2008 at 9:41 AM, Mark Latendresse
>> <> wrote:
>>> Hello,
>>> I have data on 7 independent cross-sectional surveys 1999-2007 (not panels)
>>> for which the target populations were identical. We would like to test for
>>> trends among several sub-groups on average number of cigarettes smoked per
>>> day (continuous variable). How can I do a trend analysis in Stata that will
>>> take into account sampling weights and stratification? We have bootstrap
>>> weights for each survey, however, we can also obtain the design effects for
>>> each survey if necessary.
>> --
>> Stas Kolenikov, also found at
>> Small print: I use this email account for mailing lists only.