# st: Multilevel modeling question

From: Allan Garland

To: statalist@hsphsun2.harvard.edu

Subject: st: Multilevel modeling question

Date: Fri, 11 Nov 2005 10:26:11 -0500

Dear All,

I am trying to model a 2-level problem using -xtmixed-, but cannot find the right way to do this from the manual.

Here's the problem: I'm studying hospital resource use for patients cared for by a group of doctors (call the resource/dependent variable COST). I have a number of characteristics of the patients that relate to COST (e.g. age, severity of illness, etc.).

What I REALLY want to know is the magnitude of the differences among the physicians. So, for example, an ANOVA (followed by -predict- or -adjust-) that includes the patient-level independent variables and a categorical variable representing the different doctors (call it DOCNUM) shows me (by looking at the coefficient and significance for DOCNUM) that there are substantial differences among the doctors in COST. OK so far.

But now I want to look deeper and evaluate the influence of characteristics of the doctors themselves (e.g. their years in practice, board certification status, etc.) on COST. That is, having the information from the ANOVA, I now want to see how much of the difference between the doctors in COST is "mediated by" these characteristics of the doctors themselves. Clearly, just putting these doctor-level variables into the ANOVA as if they were patient-level variables is incorrect (as discussed in many places, e.g. Snijders & Bosker's book "Multilevel Analysis").

So I'm trying to figure out the -xtmixed- syntax for this task. Unfortunately, the Stata 9 XT manual gives no examples under -xtmixed- of how to correctly include variables that are relevant only at the higher level; by contrast, in HLM I believe you do this explicitly. I do not believe (but am not certain) that it is correct to just put these doctor-level variables into the fixed-effects equation, such as:

xtmixed COST patient_age severity doc1-docN MD_age || DOCNUM:

or even

xtmixed COST patient_age severity doc1-docN MD_age || DOCNUM: MD_age

Part of the reason I think this must be wrong is that in my dataset each of the doctors has a different age, and thus the N dummy variables (doc1 to docN) created for the N+1 doctors turn out to be perfectly collinear with the MD_age variable, so the program automatically drops one of them. Since my goal is to see what proportion of the effect of the physicians can be explained by THEIR characteristics, it seems I must find a way to keep the physician identifier variables (either doc1-docN or DOCNUM) in the fixed-effects part of this model. Or do I?

I also don't think it's correct to leave the doctor identifier out of the fixed effect model (e.g. xtmixed COST patient_age severity || DOCNUM: MD_age) because then I don't get coefficients that tell me about the magnitude of the effects of the physicians as a group, or of their age on COST.

So, if anyone out there can help me with this syntax problem, I'd really appreciate it. Also, if you can tell me how to determine, from the output, what proportion of the variation in COST attributable to the doctors can in fact be explained by their characteristics (as included in the model, e.g. MD_age), I'd appreciate that too.
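
For concreteness, the two requests above can be written down as follows. This is only a sketch assuming a plain random-intercept model with patients i nested in doctors j. The symbols (gamma, u_j, tau^2, sigma^2) and the labels A and B are illustrative notation, not taken from the Stata manual or from any particular -xtmixed- output.

```latex
% Level 1 (patients i within doctor j)
\[
\mathrm{COST}_{ij} = \beta_{0j}
  + \beta_{1}\,\mathrm{patient\_age}_{ij}
  + \beta_{2}\,\mathrm{severity}_{ij}
  + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^{2})
\]

% Level 2 (doctors): the doctor-specific intercept depends on a doctor-level covariate
\[
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{MD\_age}_{j} + u_{j},
\qquad u_{j} \sim N(0, \tau^{2})
\]

% Model A: level 2 without MD_age (gamma_01 fixed at 0), doctor variance tau_A^2
% Model B: level 2 with MD_age, doctor variance tau_B^2
% One common convention for the share of between-doctor variance
% accounted for by the doctor-level covariate:
\[
R^{2}_{\text{doctor}} \approx \frac{\tau_{A}^{2} - \tau_{B}^{2}}{\tau_{A}^{2}}
\]
```

Under this convention, the between-doctor variance is estimated once with only the patient-level covariates and once with MD_age added, and the two estimates are compared. The ratio is an approximation that can come out negative in finite samples, and other definitions of explained variance at the doctor level exist.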

Thanks so much,

Allan
