
From: "Vickers, Andrew J./Integrative Medicine" <vickersa@mskcc.org>
To: statalist@hsphsun2.harvard.edu
Subject: st: Proportion as a dependent variable
Date: Thu, 17 Jul 2003 09:56:44 -0400

Ronnie Babigumira asked whether linear regression was appropriate for a proportion. Many wrote back to point out that proportions involve binary data and that linear regression is for continuous outcomes. Ronnie then clarified that the proportion was a single value between 0 and 1 for each observation: in this case, the percentage of field space allocated to new variety maize for each farmer.

My tuppence, with an open call for comment, is that many outcomes in medical research and psychometrics have similar properties to the problem Ronnie raises. For example, pain is often measured on a 0-100 scale; quality of life scales such as the SF36 convert various numerical scores into a proportion of the maximum score to give a quality of life between 0 and 100. Biostatisticians have used linear regression on such outcomes for many years without worrying too much about it, unless there was a particular reason to: as Nick Cox put it, it all depends on the data and the use to which it is being put.

If the dependent variable is normally distributed with a mean of 0.5 and an SD of 0.1, linear regression is probably going to work fine. If the dependent variable has many 0s and/or 1s, as might well be the case with the maize data, you might have a problem, in particular that your regression will make predictions outside the possible range of 0 to 1.

My guess is that with the maize data, differences between, say, 55% and 65% are neither important nor likely, as farmers will plant certain whole areas with a particular crop. Thus you could categorize the data into four bands (0-24.9%, 25%-49.9%, 50%-74.9%, 75%-100%) and then do an ordinal regression.

Andrew Vickers
Memorial Sloan-Kettering Cancer Center
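The two practical points in the message can be sketched in a few lines of Python. This is only an illustration and not part of the original post: the farm-size predictor, the data values, and the helper names `band` and `ols_fit` are all made up, and the cut points are simply the four bands suggested above. The sketch shows (a) a straight-line fit to proportions with many 0s and 1s producing fitted values outside 0 to 1, and (b) the proportions being recoded into ordered categories ready for an ordinal model.

```python
from bisect import bisect_right

# Cut points for the four bands suggested in the message:
# 0-24.9%, 25-49.9%, 50-74.9%, 75-100% (each lower edge is inclusive).
CUTS = [0.25, 0.50, 0.75]


def band(p):
    """Return an ordered category 1..4 for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"proportion out of range: {p}")
    return bisect_right(CUTS, p) + 1


def ols_fit(x, y):
    """Least-squares line y = a + b*x for one predictor; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: farm size (hectares) and share of maize area
# planted with the new variety. Note the pile-up at 0 and 1.
farm_size = [1, 2, 3, 4, 5, 6, 7, 8]
share = [0.0, 0.0, 0.0, 0.3, 0.6, 1.0, 1.0, 1.0]

a, b = ols_fit(farm_size, share)
fitted = [a + b * x for x in farm_size]

# Fitted values that are impossible as proportions.
out_of_range = [(x, round(f, 3)) for x, f in zip(farm_size, fitted)
                if f < 0.0 or f > 1.0]

# Ordered categories that could be passed to an ordinal regression.
bands = [band(p) for p in share]

print("out-of-range fitted values:", out_of_range)
print("bands:", bands)
```

With these made-up numbers the fitted line dips below 0 for the smallest farm and rises above 1 for the largest, which is the problem described for data with many 0s and 1s. The `bands` list is the categorized outcome; fitting the ordinal model itself is left to whatever package is in use.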

**Follow-Ups**:

- **st: RE: Proportion as a dependent variable** *From:* "Nick Cox" <n.j.cox@durham.ac.uk>
- **Re: st: Proportion as a dependent variable** *From:* "R. Allan Reese" <R.A.Reese@hull.ac.uk>

