

From: "Clyde Schechter" <clyde.schechter@einstein.yu.edu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Advice on xtmixed specification,pre/post two-group design
Date: Thu, 28 Jul 2011 08:41:48 -0700

Brandon,

Without actually working with your data directly, it is hard for me to say much more in specific terms. But a couple of thoughts:

First, check that the three approaches (xtmixed, ANCOVA, regression for change score) are actually being estimated on the same sample. These three approaches can differ in the way missing data affect case inclusion, so be sure that the number of pairs being analyzed is the same in each case. If not, you are applying the models to different samples and all bets are off.

Assuming that the estimation sample is indeed the same for all the analyses, then look at the meaning of the three different models. To simplify the writing of equations, I'm going to ignore the teacher level in your data; the point I'm making doesn't depend on that in any way.

Notationally, let's call y the score variable and X the vector of covariates (including treatment group, interactions, etc.). Let _i denote the subscript for the i'th subject, and _j (j = 1,2) denote the pre- and post- conditions.

The Hierarchical Linear Model (xtmixed with score as dependent variable) is:

    y_ij = a + bX_ij + u_i + e_ij,

with the usual assumptions about the u's and e's being iid, independent of each other, expectation zero... Note that X_ij may include variables that change between pre- and post-, such as time and time#control. It is also permissible for other covariates to change between pre- and post-.

ANCOVA is a somewhat different model (given that you have another level of nesting you are not, strictly speaking, doing ANCOVA, but the idea is the same for our present purposes):

    y_i2 = a' + b'X'_i + cy_i1 + e'_i

Note that in this case X'_i may not contain any variables that change between pre- and post- conditions, because there is only one observation per pre/post pair. If there are such variables in your data, then in setting up this analysis you had to have somehow excluded time-varying covariates or selected which value you entered into the analysis.
Clearly that has to be done systematically and meaningfully in light of the science of which value better predicts y_i2, or it may be that the pre- and post- values of such covariates both enter separately in the analysis. Whatever the case may be in your situation, double-check that you have set this up properly. You can see, though, that the covariate vector X'_i may look markedly different from the X_ij vector in the HLM. This may lead to different inferences about effects of particular covariates, even about covariates that are common to both X and X'.

Now, assuming that none of the foregoing complications are involved in your situation, think about intra-class correlation (ICC). The HLM as written forces a non-negative ICC = Var u/(Var u + Var e). In fact, if Var u is close to zero you probably will have problems getting the estimation to converge, so for practical purposes, the HLM forces ICC >> 0. While this is more often than not the case, there are situations where ICC is negative, so consider whether that might be the case in your situation. If it is, you cannot use the HLM: it is completely misspecified.

There is nothing exactly analogous to the ICC in the ANCOVA model, but you can see that the magnitude and sign of the coefficient c capture the same concept, and c is not constrained to be non-negative. In fact, the ANCOVA model doesn't constrain c at all; it is a freely estimated parameter of the model.

The change score model, finally, is, in effect, ANCOVA with the constraint c = 1: writing it as y_i2 - y_i1 = a' + b'X'_i + e'_i and moving y_i1 to the right-hand side gives the ANCOVA equation with c fixed at 1. Now, if the reality is that the c = 1 constraint is far off the mark, this misspecification can lead to biased estimates of the b' coefficients. So, if the science in your field doesn't make a clear a priori statement about which (if any) of these models best reflects your data generating process, looking at the coefficient c in the ANCOVA model may give you a sense of whether the HLM or change-score models are bad specifications for your data.
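To make the c = 1 point concrete, here is a small sketch in plain Python (not Stata, and not Brandon's data). Everything in it is made up for illustration: the scores, the group sizes, and the assumed data-generating process post = 5 + 3*treat + 0.5*pre with no noise. The treated group is deliberately given higher pre scores, since that is the situation in which forcing c = 1 distorts the treatment coefficient. The `ols` helper is a hypothetical hand-rolled least-squares solver, used only so the example needs no libraries.

```python
# Sketch: ANCOVA vs. change-score regression on made-up, noise-free data.
# All names and numbers here are hypothetical.

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Hypothetical pre scores; the treated group starts 4 points higher on average.
treat = [0, 0, 0, 0, 1, 1, 1, 1]
pre = [10.0, 12.0, 14.0, 16.0, 14.0, 16.0, 18.0, 20.0]

# Assumed data-generating process: post = 5 + 3*treat + 0.5*pre (no noise).
post = [5.0 + 3.0 * t + 0.5 * p for t, p in zip(treat, pre)]

# ANCOVA: post on intercept, treatment, and pre; c is freely estimated.
a_anc, b_anc, c_anc = ols([[1.0, t, p] for t, p in zip(treat, pre)], post)

# Change-score model: (post - pre) on intercept and treatment, i.e. c fixed at 1.
change = [q - p for q, p in zip(post, pre)]
a_chg, b_chg = ols([[1.0, t] for t in treat], change)

print(f"ANCOVA:       b = {b_anc:.3f}, c = {c_anc:.3f}")
print(f"Change score: b = {b_chg:.3f}")
```

With these made-up numbers, ANCOVA recovers the assumed values (b = 3, c = 0.5), while the change-score regression reports b = 1. The gap is (c - 1) times the 4-point difference in mean pre scores between groups, so it vanishes if the groups are balanced at baseline or if c really is 1.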
Anyway, after verifying that your data management has been done correctly, the choice of which model is best for your situation depends on what the science in your field tells you about how these different model specifications match up with the underlying data generating process. There is no uniform, generic answer to the question of which approach is best.

Hope this helps.

Clyde Schechter, MA MD
Associate Professor of Family & Social Medicine

Please note new e-mail address: clyde.schechter@einstein.yu.edu

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
