
From: Roger Newson <roger.newson@kcl.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: RE: RE: proportion as a dependent variable
Date: Mon, 14 Jul 2003 22:05:40 +0100

At 20:22 14/07/03 +0200, Ronnie Babigumira wrote:

>Dear Laurel, Nick, Roger, Joao Pedro, Giorgio and Todd,
>
>Many thanks for your comments. To put things in perspective, the presenter was studying new maize varieties and sought to identify some socioeconomic factors that may explain the adoption of these varieties. All respondents in her sample grew some maize (traditional, improved or both), so her dependent variable was area under improved varieties (which would then be handled easily in a censored regression framework or, better still, as a corner solution outcome).
>
>However, she argued that the area allocated needed to be adjusted for total area under maize (if one has 1 acre and allocates 0.5 acres to the maize, then in terms of adoption this should not be the same as someone with 10 acres of maize land who also allocates 0.5 acres). Hence the dependent variable was total area under new maize / total maize area (hence the proportion).
>
>From Laurel's email, it would imply that all the independent variables should also be divided by the maize area, while Nick's email points out (correctly) that while the dependent variable lies between 0 and 1, using OLS does not guarantee that the predicted values of y will lie between 0 and 1 (which is one of the main arguments against the Linear Probability Model). Roger points to a binary dependent variable; however, the dependent variable here is not quite binary. Joao Pedro suggests something that the presenter actually did, while I still need to think through Giorgio's suggestion, and I am just going to read through the paper suggested by Todd.
>
>In the light of the "added flesh" to the problem, I would appreciate your comments on the best way to proceed (for example, would just including the total maize area as one of the independent variables be a sufficient control?).

If the Y-variable is a proportion rather than a binary variable, then you can still use either -regress- with Huber variances, or -glm- with identity link and binomial family, or even -glm- with log link and binomial family if you want multiplicative effects. The -glm- command will warn you that your Y-variable is not binary, but will still do as it is asked.

The main problem with homoskedastic (equal-variance) linear regression is that, if the Y-variable is a proportion, then the conditional variance is not likely to be independent of the conditional mean, because proportions sampled from a distribution with a mean near 0.5 can vary more than proportions sampled from a distribution with a mean near 0 or 1. The -family- option of -glm- simply optimises the estimation under a particular assumption about the mean-variance relationship, in order to minimise the width of the confidence intervals if that assumption is true. If you also use the -robust- option, then your standard errors will still be consistent, even if you do not guess the mean-variance relationship right first time.

I myself would probably not simply use area under new maize as the Y-variable and area under total maize as an X-variable, because I would expect the effect of total maize area on area under new maize to be multiplicative rather than additive.
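
As a rough illustration of the three estimation options named in this message (-regress- with Huber variances, and -glm- with binomial family and either identity or log link), a minimal sketch might look as follows. The variable names (area_new, area_maize, educ, hhsize) are hypothetical and do not come from the presenter's data; the sketch assumes a dataset with these variables is already in memory.

```stata
* Hypothetical variables: area_new (area under improved maize),
* area_maize (total maize area), educ and hhsize (illustrative covariates).
generate double prop_new = area_new / area_maize

* Linear regression with Huber (robust) variances
regress prop_new educ hhsize, robust

* Binomial family, identity link: additive effects on the mean proportion
glm prop_new educ hhsize, family(binomial) link(identity) robust

* Binomial family, log link: multiplicative effects on the mean proportion
* (eform reports exponentiated coefficients, i.e. ratios)
glm prop_new educ hhsize, family(binomial) link(log) robust eform
```

In more recent Stata versions the same request is usually written vce(robust). With the identity or log link, nothing constrains fitted values to lie between 0 and 1, and -glm- may have trouble converging if they stray outside that range.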

I hope this helps.

Roger

--

Roger Newson

Lecturer in Medical Statistics

Department of Public Health Sciences

King's College London

5th Floor, Capital House

42 Weston Street

London SE1 3QD

United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648

Fax: 020 7848 6620 International +44 20 7848 6620

or 020 7848 6605 International +44 20 7848 6605

Email: roger.newson@kcl.ac.uk

Website: http://www.kcl-phs.org.uk/rogernewson

Opinions expressed are those of the author, not the institution.

