
From: Maarten buis <maartenbuis@yahoo.co.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: The dependent variable is a multi-proportion in actual values
Date: Mon, 15 Jun 2009 09:55:04 +0000 (GMT)

--- On Sat, 13/6/09, jverkuilen wrote:
> > See John Aitchison, 2003, Compositional Data Analysis.
> > -dirifit- implements the Dirichlet model, which is highly
> > restrictive. Otherwise you need to transform the proportions
> > and use a multivariate multiple regression type approach.

--- On Sun, 14/6/09, sjsamuels@gmail.com wrote:
> -fmlogit- by Maarten Buis (downloadable from SSC) does
> regression on such fractional or compositional data.

To add a bit of context: When you think of linear regression, -regress-, you model two elements of the dependent variable: the mean (and how it changes with the explanatory variables) and the variance conditional on the explanatory variables (i.e. the variance of the error term; its square root is shown in the output of -regress- as "Root MSE").

You are modeling multiple dependent variables (proportion spent on food, on clothes, and on recreation), so apart from the means and the variances you also have the covariances between the dependent variables. -dirifit- assumes that these covariances are always negative (for the exact formula see page 32 of http://home.fsw.vu.nl/m.buis/presentations/UKsug06.pdf ). This can make sense: if you spend more on clothing then there is less income left to spend on recreation or food. But this does not necessarily have to be the case: we could imagine that "fun-loving people" would spend high proportions on both clothing and recreation, thus creating a positive correlation between the two. The correlation structure of -dirifit- does not allow for this possibility and can thus be considered to be pretty restrictive.

Often (but not always) we only care about how the means (i.e. the predicted proportions) change when the explanatory variables change; the variances and covariances are in that case just nuisance parameters. If you have a large sample then you can use quasi-likelihood to get correct inference on the means even if you misspecify the model for the nuisance parameters. This is what -fmlogit- does.
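The covariance restriction of the Dirichlet model described above can be checked numerically. What follows is only a minimal sketch in Python, not something -dirifit- itself runs: the parameter values and the labels food, clothes, and recreation are made up for illustration. It uses the standard result that for a Dirichlet distribution with parameters a_1, ..., a_k and a_0 = a_1 + ... + a_k, the covariance between two different components i and j is -a_i*a_j / (a_0^2 * (a_0 + 1)), which is negative for any admissible parameter values.

```python
import random


def dirichlet_cov(alpha, i, j):
    """Analytic covariance between components i and j (i != j) of a Dirichlet(alpha)."""
    a0 = sum(alpha)
    return -alpha[i] * alpha[j] / (a0 ** 2 * (a0 + 1))


def sample_dirichlet(alpha, rng):
    """One Dirichlet draw: independent gamma variates divided by their sum."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_cov(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical parameters for the shares spent on food, clothes, recreation.
alpha = [2.0, 3.0, 5.0]

rng = random.Random(2009)  # fixed seed so the sketch is reproducible
draws = [sample_dirichlet(alpha, rng) for _ in range(20000)]
clothes = [d[1] for d in draws]
recreation = [d[2] for d in draws]

analytic = dirichlet_cov(alpha, 1, 2)
simulated = sample_cov(clothes, recreation)

print("analytic  cov(clothes, recreation):", analytic)
print("simulated cov(clothes, recreation):", simulated)
```

Whatever positive values are chosen for alpha, the analytic covariance stays below zero, so a positive association such as the "fun-loving people" pattern cannot be represented by this distribution.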
The basic idea behind this quasi-likelihood approach is discussed in Papke and Wooldridge (1996). A critique of quasi-likelihood / robust standard errors in general can be found in Freedman (2006).

Hope this helps,
Maarten

Freedman, David A. (2006) On The So-Called "Huber Sandwich Estimator" and "Robust Standard Errors", The American Statistician, 60(4), pp. 299-302.

Papke, Leslie E. and Jeffrey M. Wooldridge (1996) Econometric Methods for Fractional Response Variables with an Application to 401(k) Plan Participation Rates, Journal of Applied Econometrics, 11(6), pp. 619-632.

-----------------------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://home.fsw.vu.nl/m.buis/
-----------------------------------------


