
Re: st: an estimation method question

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: an estimation method question
Date: Tue, 16 Mar 2010 16:16:52 +0000 (GMT)

```
--- On Tue, 16/3/10, Xiang Ao <xao@hbs.edu> wrote:
> We are trying to estimate an equation which has some
> constraints by observation.  We are studying founders
> of firms.  The dependent variable is the share of the
> firm.  We are studying what factors influence the
> shares.  We now have a problem:  for each firm,
> the shares necessarily sum to one.  We were thinking of
> dropping an observation per firm, but it turns out the
> results are dependent on which observation to drop.
> Any suggestions on estimation methods would be helpful.

With this type of data the key distinction is whether you
have 2 or more than 2 categories whose proportions should
add up to 1. In the former case you can use either -betafit-
or -glm varlist, link(logit) family(binomial) vce(robust)-.
In the latter case you can use -dirifit- or -fmlogit-.

You can install -betafit-, -dirifit-, and -fmlogit- by
typing in Stata:
ssc install betafit
ssc install dirifit
ssc install fmlogit

-betafit-, -glm-, and -dirifit- were discussed in this talk
at the 2006 London Stata Users' Group meeting:
http://ideas.repec.org/s/boc/usug06.html

The -glm- trick is based on this paper:
Papke, Leslie E. and Jeffrey M. Wooldridge. (1996)
"Econometric Methods for Fractional Response Variables with
an Application to 401(k) Plan Participation Rates". Journal
of Applied Econometrics, 11(6):619-632.

The -glm- trick is also discussed in the Stata tip:
Christopher F. Baum (2008) "Stata tip 63: Modeling proportions"
The Stata Journal, 8(2): 299-303.
http://www.stata-journal.com/article.html?article=st0147

The model implemented in -betafit- was discussed in a number
of papers:
Ferrari, S.L.P. and Cribari-Neto, F. (2004). "Beta regression
for modelling rates and proportions". Journal of Applied
Statistics 31(7): 799-815.

Paolino, P. (2001). "Maximum likelihood estimation of models
with beta-distributed dependent variables". Political Analysis,
9(4): 325-346.

Smithson, M. and Verkuilen, J. (2006) "A better lemon squeezer?
Maximum likelihood regression with beta-distributed dependent
variables". Psychological Methods, 11(1): 54-71.

-fmlogit- is basically a generalization of the -glm- trick to
multiple categories.

Examples can be found here:
http://www.maartenbuis.nl/software/betafit.html
http://www.maartenbuis.nl/software/dirifit.html
http://www.maartenbuis.nl/software/fmlogit.html

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------

```
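
As a rough illustration of the two cases distinguished in the reply above, the following Stata sketch simulates firm-level data and fits the two-category case with -glm- and a three-category case with -fmlogit-. The variable names (x, share_a, share1 to share3) and the simulated data are made up for illustration and do not come from the original thread. The eta() option of -fmlogit- is written from memory of its help file, so check -help fmlogit- after installing it.

```stata
* Simulated data, for illustration only: one row per firm
clear
set seed 12345
set obs 200

* Hypothetical firm-level covariate
gen x = rnormal()

* Two founders: model one share; the other share is 1 - share_a
gen share_a = invlogit(0.5*x + rnormal())
glm share_a x, link(logit) family(binomial) vce(robust)

* Three founders: shares constructed to sum to one within each firm
gen e2 = exp(0.5*x + rnormal())
gen e3 = exp(-0.3*x + rnormal())
gen share1 = 1/(1 + e2 + e3)
gen share2 = e2/(1 + e2 + e3)
gen share3 = e3/(1 + e2 + e3)

* Fractional multinomial logit (user-written; install with: ssc install fmlogit)
fmlogit share1 share2 share3, eta(x)
```

In this layout all shares enter as dependent variables on one row per firm, so no founder has to be dropped before estimation.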