
From: "Lachenbruch, Peter" <Peter.Lachenbruch@oregonstate.edu>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: zero inflated beta [was: st: Information request]
Date: Thu, 13 Aug 2009 08:32:54 -0700

The situation seems to be a hurdle model or two-part model. It is related to the zero-inflated Poisson or negative binomial model. In this case, the zeros are identifiable, so the problem is related to checking for common proportions and equal slopes among the non-zero values. Here are some references to get you started.

Lachenbruch, P. A. (2001) "Comparison of competitors to the two part model" Statistics in Medicine 20(8) 1215-1234

Lachenbruch, P. A. (2001) "Power and Sample Size Requirements for Two-part models" Statistics in Medicine 20(8) 1235-1239

Lachenbruch, P. A. (2002) "Analysis of Data with Excess Zeros" Statistical Methods in Medical Research 11(4) 297-302

The last reference is an introduction to a special issue of SMMR devoted to the issues of excess zeros. Some papers are on mixture models (like zip or zinb) and some on identifiable models.

Tony

Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Maarten buis
Sent: Thursday, August 13, 2009 1:34 AM
To: statalist@hsphsun2.harvard.edu
Subject: zero inflated beta [was: st: Information request]

--- On Wed, 12/8/09, Fabio Zona wrote:
> I am in the unfortunate situation of running a regression
> analysis, whereby:
> - my dependent variable is a proportion (percentage of
> bonus on total compensation of top managers of 178
> corporations),
> - the majority (more than 50%) of my managers does not have
> any bonus, so the proportion is exact ZERO, that is, my
> dependent variable has many exact zeros.
>
> How can I estimate this model? do you know the command I
> should use in Stata?
> I know that I cannot use the fractional logit because I
> have many zeros. I have not found any zero-inflated logistic
> regression for situations whereby y are proportion

A zero-inflated fractional logit model is hard to identify. A zero-inflated beta is probably better, but there is obviously a price (there is no such thing as a free lunch...), and that price is more restrictive assumptions.

Below is a quick stab at implementing such a model. I haven't done any checking or certification on it, so it is up to you to determine whether this program actually does what it is supposed to do. As a first step I would build a simulated dataset where you know what the parameters should be and check whether this program actually finds those.

Hope this helps,
Maarten

*----------- begin example ---------------
clear
program drop _all
set more off
input prop str1 site variety
0.0005 A 1
0.0000 A 2
0.0000 A 3
0.0010 A 4
0.0025 A 5
0.0005 A 6
0.0050 A 7
0.0130 A 8
0.0150 A 9
0.0150 A 10
0.0000 B 1
0.0005 B 2
0.0005 B 3
0.0030 B 4
0.0075 B 5
0.0030 B 6
0.0300 B 7
0.0750 B 8
0.0100 B 9
0.1270 B 10
0.0125 C 1
0.0125 C 2
0.0250 C 3
0.1660 C 4
0.0250 C 5
0.0250 C 6
0.0000 C 7
0.2000 C 8
0.3750 C 9
0.2625 C 10
0.0250 D 1
0.0050 D 2
0.0001 D 3
0.0300 D 4
0.0250 D 5
0.0001 D 6
0.2500 D 7
0.5500 D 8
0.0500 D 9
0.4000 D 10
0.0550 E 1
0.0100 E 2
0.0600 E 3
0.0110 E 4
0.0250 E 5
0.0800 E 6
0.1650 E 7
0.2950 E 8
0.2000 E 9
0.4350 E 10
0.0100 F 1
0.0500 F 2
0.0500 F 3
0.0500 F 4
0.0500 F 5
0.0500 F 6
0.1000 F 7
0.0500 F 8
0.5000 F 9
0.7500 F 10
0.0500 G 1
0.0010 G 2
0.0500 G 3
0.0500 G 4
0.5000 G 5
0.1000 G 6
0.5000 G 7
0.2500 G 8
0.5000 G 9
0.7500 G 10
0.0500 H 1
0.1000 H 2
0.0500 H 3
0.0500 H 4
0.2500 H 5
0.7500 H 6
0.5000 H 7
0.7500 H 8
0.7500 H 9
0.7500 H 10
0.1750 I 1
0.2500 I 2
0.4250 I 3
0.5000 I 4
0.3750 I 5
0.9500 I 6
0.6250 I 7
0.9500 I 8
0.9500 I 9
0.9500 I 10
end

encode site, gen(sitenum)
gen byte left = sitenum <= 4

program define zibeta_lf
    *! MLB 0.0.1 13 Aug 2009
    version 8.2
    args lnf logitmu lnphi zb
    tempvar zero nonzero mu phi

    * Pr(y == 0) and Pr(y > 0) from the zero-inflation equation
    quietly gen double `zero'    = invlogit(`zb')
    quietly gen double `nonzero' = invlogit(-`zb')

    * mean and precision of the beta part
    quietly gen double `mu'  = invlogit(`logitmu')
    quietly gen double `phi' = exp(`lnphi')

    * log likelihood for y > 0: ln Pr(y > 0) plus the log beta density
    quietly replace `lnf' = ln(`nonzero')             + ///
                            lngamma(`phi')            - ///
                            lngamma(`mu'*`phi')       - ///
                            lngamma((1-`mu')*`phi')   + ///
                            (`mu'*`phi'-1)*ln($ML_y)  + ///
                            ((1-`mu')*`phi'-1)*ln(1-$ML_y) ///
                            if $ML_y > 0

    * log likelihood for y == 0
    quietly replace `lnf' = ln(`zero') if $ML_y == 0
end

xi i.site i.variety
ml model lf zibeta_lf (logitmu: prop = _I*) /lnphi (zg:left), robust
ml check
ml search
ml maximize
exit
*--------------- end example ----------------------

-----------------------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
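The simulation check that Maarten recommends as a first step could be sketched roughly as follows. This is only an illustration, not part of the original program: the covariate `x`, the sample size, the seed, and all "true" parameter values are assumptions picked for the example. It also assumes Stata 10.1 or later, because it relies on `rnormal()`, `runiform()`, and `rbeta()`. The evaluator is repeated from the message (without the added comments) so that the block runs on its own.

```stata
* Simulation check for zibeta_lf: a sketch under assumed parameter values
clear
capture program drop zibeta_lf
set seed 12345
set obs 5000

* Hypothetical covariates (names and distributions are assumptions)
gen double x  = rnormal()
gen byte left = runiform() < 0.5

* Assumed true model:
*   logit(mu)       = -1   + 0.5*x
*   ln(phi)         =  2
*   logit(Pr(y==0)) = -0.5 + 1*left
gen double mu = invlogit(-1 + 0.5*x)
scalar phi    = exp(2)
gen double p0 = invlogit(-0.5 + left)

* y is 0 with probability p0, otherwise a beta(mu*phi, (1-mu)*phi) draw
gen double prop = cond(runiform() < p0, 0, ///
                       rbeta(mu*scalar(phi), (1-mu)*scalar(phi)))

* Evaluator copied from the example in the message
program define zibeta_lf
    version 8.2
    args lnf logitmu lnphi zb
    tempvar zero nonzero mu phi
    quietly gen double `zero'    = invlogit(`zb')
    quietly gen double `nonzero' = invlogit(-`zb')
    quietly gen double `mu'      = invlogit(`logitmu')
    quietly gen double `phi'     = exp(`lnphi')
    quietly replace `lnf' = ln(`nonzero')             + ///
                            lngamma(`phi')            - ///
                            lngamma(`mu'*`phi')       - ///
                            lngamma((1-`mu')*`phi')   + ///
                            (`mu'*`phi'-1)*ln($ML_y)  + ///
                            ((1-`mu')*`phi'-1)*ln(1-$ML_y) ///
                            if $ML_y > 0
    quietly replace `lnf' = ln(`zero') if $ML_y == 0
end

* Fit the model to the simulated data
ml model lf zibeta_lf (logitmu: prop = x) /lnphi (zg: left)
ml maximize
```

If the evaluator is correct, the estimates should land near the assumed values: roughly -1 and 0.5 in the `logitmu` equation, 2 for `lnphi`, and -0.5 and 1 in the `zg` equation. Repeating the run with different seeds or a larger sample gives a sense of whether any remaining discrepancy is only sampling noise.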

References:
- st: Information request
  From: Fabio Zona <fabio.zona@unibocconi.it>
- zero inflated beta [was: st: Information request]
  From: Maarten buis <maartenbuis@yahoo.co.uk>
