# Re: st: clustered regression--in over my head

From: "Ada Ma"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: clustered regression--in over my head
Date: Fri, 12 May 2006 11:45:22 +0100

```If you are simply looking at the correlation between the number of
visits and BMI, then you shouldn't use the -cluster()- option at
all.  Currently you have multiple observations per patient, but for
this question you should have only one observation per patient.  If
you have more than one BMI measurement per patient, take the average.

If, as you said, you want to compare the growth of BMI to the
number of visits, the easiest way is to take the difference
between the BMI at the first and last visits, and correlate that with
the number of visits.

Because you are looking at a form of panel data, picking up time
series skills alone won't be sufficient; you'll need to pick up some
panel data skills as well.  Two years sounds like a very short time
frame anyway.  In particular, I would recommend that you think about
attrition: those who come for more visits are those who are still
alive and who didn't move away or, for whatever reason, leave the
practice.

On 5/11/06, Christopher W. Ryan <cryan@binghamton.edu> wrote:
```
```I am trying to work with one of my residents who is studying obesity in
our practice.  We have a dataset consisting of about 150 patients, who
collectively made about 700 visits over the course of two years.  They
were selected for study because their BMI > 30.  Number of visits per
person ranges from 1 to over 15.

Pertinent variables are:
mrn = medical record number
bmi = body mass index
visitnumber =  the sequential visit number for a patient:  first visit,
second visit, etc.

One question we are interested in is:  does the BMI tend to increase
over time?

I can't just look at mean BMI by visit number.  That statistic does
increase, but we suspect that increasingly heavier people simply tend to
make more visits, because of more severe comorbidities.

So here's where I get into deep water.  Using Stata 8, I tried this:

-regress bmi visitnumber, cluster(mrn)-

which produces (abbreviated for compactness):

Regression with robust standard errors            Number of obs =     766
                                                  F(  1,   149) =    4.54
                                                  Prob > F      =  0.0348
                                                  R-squared     =  0.0413
Number of clusters (mrn) = 150                    Root MSE      =  6.8239

------------------------------------------------------------
             |               Robust
         bmi |      Coef.   Std. Err.      t    P>|t|
-------------+----------------------------------------------
 visitnumber |   .2778606   .1304586     2.13   0.035
       _cons |   36.10313   .5658324    63.81   0.000
------------------------------------------------------------

Am I on the right track with this?  I have never used -cluster()-
before.  I won't be learning anything about time series analysis until
the fall, and I wonder whether it might be appropriate in this situation.

Thanks.

--Chris

--
Christopher W. Ryan, MD
SUNY Upstate Medical University Clinical Campus at Binghamton
and Wilson Family Practice Residency, Johnson City, NY
cryanatbinghamtondotedu
GnuPG and PGP public keys available at http://pgp.mit.edu

"If you want to build a ship, don't drum up the men to gather wood,
divide the work and give orders. Instead, teach them to yearn for the
vast and endless sea."  [Antoine de St. Exupery]
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
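The reply's two suggestions (one observation per patient, then correlate either the mean BMI or the first-to-last change in BMI with the number of visits) might be sketched in Stata roughly as follows. The toy data and the names `nvisits`, `meanbmi`, and `bmichange` are invented for illustration and do not come from the original dataset; only `mrn`, `bmi`, and `visitnumber` are taken from the post.

```stata
* Toy data, made up for illustration only (not the original dataset)
clear
input mrn visitnumber bmi
1 1 31.0
1 2 32.0
1 3 33.0
2 1 35.0
2 2 35.5
3 1 40.0
3 2 41.0
3 3 42.0
3 4 44.0
end

* Per-patient summaries (hypothetical variable names)
bysort mrn (visitnumber): generate nvisits = _N
bysort mrn: egen meanbmi = mean(bmi)
bysort mrn (visitnumber): generate bmichange = bmi[_N] - bmi[1]

* Reduce to one observation per patient (the last visit's row)
bysort mrn (visitnumber): keep if _n == _N

* Mean BMI versus number of visits
correlate meanbmi nvisits

* First-to-last change versus number of visits; patients with a single
* visit have a change of zero by construction, so they are left out here
correlate bmichange nvisits if nvisits > 1
```

Excluding single-visit patients from the second correlation is a choice made for this sketch, not something the reply specifies.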
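As one possible first step toward the panel data skills mentioned in the reply, a within-patient regression compares each patient only with his or her own earlier visits, so that heavier patients making more visits do not by themselves produce a positive slope. The sketch below uses -areg- with patient-level intercepts absorbed; this is an assumption about what a panel approach could look like here, not a method proposed in the thread, and the toy data are again invented. It does nothing about the attrition problem raised in the reply.

```stata
* Toy data, made up for illustration only (not the original dataset)
clear
input mrn visitnumber bmi
1 1 31.0
1 2 32.0
1 3 33.0
2 1 35.0
2 2 35.5
3 1 40.0
3 2 41.0
3 3 42.0
3 4 44.0
end

* Regress BMI on visit number with a separate intercept per patient;
* the slope is then estimated from within-patient changes only
areg bmi visitnumber, absorb(mrn)
```

The fixed-effects estimator of -xtreg- gives the same slope; -areg- is used here only because it needs no prior panel declaration.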