Meg Dennison

statalist@hsphsun2.harvard.edu

st: Using sampling/probability weights for mixed design ANOVA in STATA

Fri, 27 May 2011 10:36:33 +1000

Hi All,

Steven, thanks for your reply. I have inserted my answers below.

> But the description of your data is unclear. You refer to one between-subject and two within-subject "variables", but to "the" (single?) repeated measures variable with two levels. Isn't this a within-subject variable? By two levels do you mean two occasions (if longitudinal)? Which, if any, variables (besides subject) do you consider to be "random effects"?

I am looking at brain development over time. I have collected data on brain measures at two time points for each subject (the repeated measure - baseline and follow-up). Additionally, these brain measures are collected from both the left and the right hemisphere within a single person and are not independent, so they are being treated as another within-subjects variable (hemisphere - left and right). The between-subjects variable is sex (male and female).

> So please clarify what the variables are and list the data for some subjects, so that we can see where you are starting from.

So, the data would look like this:

Subject  BrainVol  Time  Hemisphere
1        1345      1     left
1        2345      2     left
1        3546      1     right
1        3457      2     right
etc.

> In any case, for complex survey data, the standard errors for estimates are governed by variation of primary sampling units (PSUs, first-stage clusters) within strata, so the usual ANOVA formulas would not ordinarily apply. Stata can analyze some mixed model designs with survey data.
>
> Some other questions that will help us suggest analyses:
> 1. What is the sampling design? If there were strata, do they correspond to the "between-subject" variable?

The sampling design involved postcodes being randomly selected across a metropolitan city. Within these postcodes (strata?), schools were randomly selected to participate (clusters?). All Grade 5 classes within these schools were asked to complete a survey (obviously not all consented or were present at school that day, etc.). The survey they completed consisted of four factors.
Two of these factors were used to select subjects for further participation - the probability of being selected is the probability weights that I have based on this sampling bias. From this initial sample of about 2500, 400 were invited to participate in the research, and from those who were invited, I have 101 who participated in my study. The variable on which they were initially sampled does not correspond to sex - the BS variable in my study. I am not interested in the variable on which the sampling bias was introduced - my data is derived from a larger research project for which this initial sampling bias was desirable.

> 2. Are replicate (bootstrap, jackknife, BRR) weights available? Did the survey distributor provide SAS or SPSS macros to compute them?

No, the selection was not done using these programs.

> 3. What questions are you trying to answer? What parameters do you hope to estimate or test in your analysis?

I am interested in describing typical brain development - how it changes over time by sex and hemisphere, and their interaction. I believe that the initial sample of 2500 was reasonably representative of normally developing children (obviously with the caveats of living in a certain country, being at school, living in a city, etc.). I would like to correct for the sampling bias that was introduced.

Thanks in advance,
Meg

> 4. What version of Stata do you have?

Version 11.

On Tue, May 24, 2011 at 11:54 PM, Steven Samuels <sjsamuels@gmail.com> wrote:
>
> Hi, Meg.
>
> Welcome to Stata! You will find that Stata's regression and survey capabilities are both far superior to those of SPSS.
>
> But the description of your data is unclear. You refer to one between-subject and two within-subject "variables", but to "the" (single?) repeated measures variable with two levels. Isn't this a within-subject variable? By two levels do you mean two occasions (if longitudinal)? Which, if any, variables (besides subject) do you consider to be "random effects"?

So please clarify what the variables are and list the data for some subjects, so that we can see where you are starting from.

In any case, for complex survey data, the standard errors for estimates are governed by variation of primary sampling units (PSUs, first-stage clusters) within strata, so the usual ANOVA formulas would not ordinarily apply. Stata can analyze some mixed model designs with survey data.
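
As a rough illustration of what declaring a survey design looks like in Stata, here is a minimal sketch. Every variable name in it (psu_id, stratum_id, pw, brainvol, time, hemisphere, sex) is a hypothetical placeholder, and the correct choice of PSU and strata depends on how the sample was actually drawn. Note that this fits a design-based regression, not a mixed-design ANOVA.

```stata
* Sketch only: psu_id, stratum_id, pw, brainvol, time, hemisphere, and sex
* are hypothetical variable names, not taken from any real dataset.
* time, hemisphere, and sex are assumed to be numeric-coded categories.

* Declare the design: PSU identifier, probability weight, and strata
svyset psu_id [pweight = pw], strata(stratum_id)

* Summarize the strata and PSUs as Stata has recorded them
svydescribe

* Design-based regression; standard errors are computed from the
* variation between PSUs within strata (linearized estimator by default)
svy: regress brainvol i.time i.hemisphere i.sex
```

Because the variance here is computed from PSU-level totals, repeated rows from the same subject within a PSU do not need to be treated as independent. Whether this is an adequate substitute for the intended mixed-design analysis is a separate question.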

Some other questions that will help us suggest analyses:
1. What is the sampling design? If there were strata, do they correspond to the "between-subject" variable?
2. Are replicate (bootstrap, jackknife, BRR) weights available? Did the survey distributor provide SAS or SPSS macros to compute them?
3. What questions are you trying to answer? What parameters do you hope to estimate or test in your analysis?
4. What version of Stata do you have?

Steve
sjsamuels@gmail.com

On May 23, 2011, at 9:20 AM, Meg Dennison wrote:

Hi,

I have a complex sample, for which I need to use sampling weights
(probability weights). I already have these values derived from the
initial sampling selection. I wanted to then perform a mixed design
ANOVA (with 2 within subjects variables and one between subjects
variable). The repeated measures variable only has 2 levels.

I have only used SPSS before and the Complex Sampling Add-on module
only allows for univariate ANOVA. Can STATA perform this type of
analysis? From what I could see from looking at the GUI and reading
the manual, probability weights (pweights) could not be used for mixed
ANOVA?

Is there another way I should be thinking about this?

Thanks in advance for your help,


Kind regards,

Meg

--

Meg Dennison BA(Hons) MPsych(Clin)/PhD Candidate
School of Psychological Sciences, University of Melbourne
megd@student.unimelb.edu.au

