Re: st: Analysis of experiment involving baseline measurements

From	Philip Jones <[email protected]>
To	[email protected]
Subject	Re: st: Analysis of experiment involving baseline measurements
Date	Tue, 05 Jul 2011 16:50:02 -0400

Thank you, Clyde, for your helpful response.

To answer your queries and to request more assistance, I provide the following
information. Again, I apologize in advance for the long email!

Because this intervention involves training that I expect to improve
performance, and because I expect that improvement to be retained to some
extent, I expect the baseline measurement to be lower than *both* of the
subsequent two measurements. However, I also expect the third measurement to be
lower than the second (but higher than the baseline), given our natural tendency
to forget details of educational interventions over time. So, I would expect
'tapering', as you put it.

I have done as you suggested and reshaped my data to long format, giving a table
like this (where group and time variables have labels, but are really integers):

     +---------------------------------------+
     | ID        group        time   outcome |
     |---------------------------------------|
  1. |  1     didactic    baseline        14 |
  2. |  1     didactic   immediate        24 |
  3. |  1     didactic     six_wks        17 |
     |---------------------------------------|
  4. |  2   simulation    baseline        12 |
  5. |  2   simulation   immediate        23 |
  6. |  2   simulation     six_wks        22 |
     |---------------------------------------|
  7. |  3     didactic    baseline        18 |
  8. |  3     didactic   immediate        24 |
  9. |  3     didactic     six_wks        19 |
     |---------------------------------------|
 10. |  4   simulation    baseline        16 |
 11. |  4   simulation   immediate        23 |
 12. |  4   simulation     six_wks        23 |
     |---------------------------------------|
[snip]
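
For reference, a minimal sketch of the reshape step, assuming the original wide
file had one row per ID with the three scores stored as outcome0, outcome1 and
outcome2. Those wide variable names, the 0/1/2 coding of time and the label name
timelbl are assumptions for illustration, not taken from the actual dataset:

```stata
* Assumed wide layout: ID group outcome0 outcome1 outcome2
* (hypothetical variable names, for illustration only)
reshape long outcome, i(ID) j(time)

* Assumed coding of time: 0 = baseline, 1 = immediate, 2 = six_wks
label define timelbl 0 "baseline" 1 "immediate" 2 "six_wks"
label values time timelbl

* Inspect the first two subjects in long format
list ID group time outcome in 1/6, sepby(ID)
```

With this layout, group stays constant within ID and time indexes the three
measurements, which is the structure shown in the listing above.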

I subsequently ran the model as:

-- xtmixed outcome i.group##i.time || ID: --

Which gives me the following output:

[snip]
------------------------------------------------------------------------------
     outcome |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     1.group |      -1.25   1.837117    -0.68   0.496    -4.850684    2.350684
             |
        time |
          1  |        8.5   1.660405     5.12   0.000     5.245666    11.75433
          2  |          4   1.660405     2.41   0.016     .7456661    7.254334
             |
  group#time |
        1 1  |       1.25   2.348167     0.53   0.594    -3.352323    5.852323
        1 2  |        4.5   2.348167     1.92   0.055    -.1023231    9.102323
             |
       _cons |         14   1.299038    10.78   0.000     11.45393    16.54607
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
ID: Identity                 |
                   sd(_cons) |   1.111805   .8665695      .2413128    5.122442
-----------------------------+------------------------------------------------
                sd(Residual) |   2.348167   .4793176      1.573904     3.50332
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =     0.55 Prob >= chibar2 = 0.2282
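
One possible way to read the random-effects table, offered as a sketch rather
than a definitive interpretation: the two standard deviations can be combined
into the share of outcome variance attributable to stable between-subject
differences (an intraclass correlation). The numbers below are copied by hand
from the output above:

```stata
* sd(_cons) and sd(Residual) typed in from the xtmixed output above
* ICC = between-subject variance / (between-subject + residual variance)
display 1.111805^2 / (1.111805^2 + 2.348167^2)
```

With these estimates the ratio comes out at roughly 0.18.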


I want to compare the groups to each other (for the 'outcome') at the two "test"
time points (the non-baseline levels of time). For example, I want to test
whether the didactic group at 2.time differs from the simulation group at
2.time. I am doing this as follows:

-- test 2.time = (1.group + 2.time + 1.group#2.time) --

which gives me a simplified version of the hypothesis and a P value:

 ( 1)  - [outcome]1.group - [outcome]1.group#2.time = 0

           chi2(  1) =    3.13
         Prob > chi2 =    0.0769
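
The simplified constraint shows that this test is equivalent to testing whether
1.group + 1.group#2.time equals zero. As a hedged sketch, the same linear
combination can be passed to -lincom-, which reports the estimate together with
a standard error and confidence interval. This assumes the xtmixed model above
is the most recent estimation command:

```stata
* Refit the model from the post so its estimates are active
xtmixed outcome i.group##i.time || ID:

* Difference between group 1 and group 0 at 2.time
lincom 1.group + 1.group#2.time

* Same between-group contrast at 1.time
lincom 1.group + 1.group#1.time
```

From the coefficient table above, the point estimates should be about
-1.25 + 4.5 = 3.25 at 2.time and -1.25 + 1.25 = 0 at 1.time.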

I am obtaining the confidence interval for the mean 'outcome' in each group at
each time point as follows:

-- ci outcome if group==1 & time==2 --

Questions:
===========

1) Is the -- test -- command as I have structured it above the correct way to
obtain the P values for between-group comparisons at varying time points?

2) What command can I use for a confidence interval of the *difference* in
'outcome' between the two groups at a certain time point?

3) In the --xtmixed-- command, am I really 'controlling' for baseline values in
the traditional sense, or am I just including that (baseline) time in the model?
In other words, at 1.time and 2.time, are these parameter estimates actually
adjusted for baseline performance for the specific group as they would be in OLS
regression? Am I actually including baseline time 'outcome' as a covariate as
Clyde suggests in his message below?

4) How can I meaningfully make use of the random effects portion of this model?

Many thanks in advance for any assistance.

Phil


> Phil Jones asks for advice in adjusting for baseline measurements when
> analyzing data with two follow-up points.
> 
> You need to first think about what theory underlies the intervention and
> the implications for how the outcome score will evolve over time--the
> modeling will depend on that.  Do you expect both groups to improve from
> pre to post and continue to improve at 6wks?  If so, will they continue to
> improve at the same rate as from pre- to post-, or will there be a
> tapering off (or an acceleration)? Or do you expect the scores to
> deteriorate somewhat at 6 wks?
> 
> If you -reshape- your data into long format, you can model any of these
> possibilities using -xtmixed- or -xtreg-.  The independent variables
> specification may involve a single degree-of-freedom specification of
> time, or time as a factor variable, or perhaps as a spline. And your
> representation of time will then have interaction terms with group.  You
> will also have the option of either including the baseline value as a
> covariate (and not analyzing time = pre observations) or not.  But you
> have to have a model of the time-trajectories of the output in mind to
> make the corresponding decisions.
> 
> Hope this helps you make progress.
> 
> 
> Clyde Schechter
> Department of Family & Social Medicine
> Albert Einstein College of Medicine
> Bronx, NY, USA