Statalist The Stata Listserver



st: RE: Linear regression of 'log' predictors


From   "Maarten Buis" <M.Buis@fsw.vu.nl>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Linear regression of 'log' predictors
Date   Wed, 27 Sep 2006 15:56:58 +0200

Ashwin:
First, given the name of your dependent variable, "length of stay", I presume that it measures some duration (duration till leaving). In that case I would strongly recommend using Stata's survival time models, either -stcox- or -streg-. If you don't have any censoring (people who hadn't yet left when you stopped collecting data), then using log(los) as the dependent variable is equivalent to using -streg- with the distribution(lnormal) option. However, it is very unlikely that you have no censoring, in which case -streg- is by far preferable. I have written a short introduction to survival analysis, which you can get from http://home.fsw.vu.nl/m.buis/wp/survival.html . It also contains some links to other sites with information on survival analysis.
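To illustrate why censoring matters, here is a minimal sketch in Python (standard library only). The data are simulated and entirely hypothetical: the log-mean of 2.0, log-sd of 0.5, and the cutoff of 10 are assumptions chosen for illustration, not values from your data. It shows that treating censored stays as if they ended at the cutoff pulls the mean of log(los) downward.

```python
import math
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical data: 10,000 log-normal durations (log-mean 2.0, log-sd 0.5)
true_durations = [random.lognormvariate(2.0, 0.5) for _ in range(10_000)]

# Suppose data collection stops at time 10: longer stays are only known
# to exceed 10, so the recorded duration is capped at the cutoff
cutoff = 10.0
recorded = [min(t, cutoff) for t in true_durations]
share_censored = sum(t > cutoff for t in true_durations) / len(true_durations)

# Mean of log(duration) using the complete (normally unobservable) durations
mean_log_true = sum(math.log(t) for t in true_durations) / len(true_durations)

# Naive analysis: treat censored durations as if they ended at the cutoff
mean_log_naive = sum(math.log(t) for t in recorded) / len(recorded)

print(f"share censored:          {share_censored:.2f}")
print(f"mean log(los), complete: {mean_log_true:.3f}")
print(f"mean log(los), naive:    {mean_log_naive:.3f}")
```

The naive mean is always below the complete-data mean here, because every censored duration is replaced by a smaller number. A survival model such as -streg- avoids this by using the censored observations only as "lasted at least this long".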

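As for interpretation, say you have one explanatory variable called female (0 = male, 1 = female) and you find a regression coefficient of -.04 in the model for log(los); then the average duration is approximately 4% less for females than for males. More precisely, the change is 100*(exp(-.04) - 1), or about -3.9%; the approximation "coefficient times 100 = percent change" only works for coefficients close to zero.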
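The arithmetic behind that interpretation can be sketched in a few lines of Python. The function name percent_change is just a hypothetical helper for this illustration; the formula 100*(exp(b) - 1) is the standard conversion for a coefficient b when the dependent variable is log-transformed.

```python
import math

def percent_change(b):
    """Percent change in duration implied by a coefficient b on the log scale."""
    return 100 * (math.exp(b) - 1)

# A small coefficient is close to its "times 100" reading
print(round(percent_change(-0.04), 2))  # -3.92

# A large coefficient is not: -4 on the log scale is a 98% reduction, not 4%
print(round(percent_change(-4), 2))     # -98.17
```

Note that exp(b) itself is a ratio of durations (females relative to males in the example), so it should be compared with 1, not with 0.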

HTH,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434 

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of Ashwin Ananthakrishnan
Sent: Wednesday 27 September 2006 15:32
To: statalist@hsphsun2.harvard.edu
Subject: st: Linear regression of 'log' predictors

I have a model where the outcome is length of stay
(los). This variable has some right skew and is not
perfectly 'normal'.

Is it valid for me to run linear regression of other
predictors on length of stay if the los is not
normally distributed?

If it is not valid, then log(los) is a normally
distributed variable. But how do I interpret the
coefficients of the log(los)? I find that
exponentiating log(los) coefficient doesn't seem to be
appropriate as it doesn't yield valid results. For
example p>0.05, but the 95% CI don't overlap 'zero'
which is what I would expect in linear regression.
Also exp(log(los)) doesn't give a similar estimate as
the coefficients if I run the regression on los
directly.

I apologize in advance if my question is either too
basic or difficult to understand.
Thank you.


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



