In fact, bias is not the only issue. What you propose
is quite likely to lead to predictions above 100%,
possibly even within the range of the data.
There are various possibilities other than -regress-.
One is a generalised linear model with logit link.
You would need to scale from percents to proportions,
but that is easy. (By the sound of it, your data are
proportions already.) There is more on that at
FAQ: Logit transformation (8/04)
"How does one estimate a model when the dependent
variable is a proportion?"
http://www.stata.com/support/faqs/stat/logit.html
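
As a minimal sketch of the first possibility, something like the following should work. The variable names (share, pct, x1, x2) are placeholders of mine, not from your data, and robust standard errors are my own choice because the response is a proportion rather than a binomial count.

```stata
* Sketch only: share, pct, x1 and x2 are hypothetical variable names.
* If the response is stored as a percent, rescale it to a proportion first:
* generate double share = pct / 100

* Generalised linear model with logit link; the response must lie in [0, 1]
glm share x1 x2, family(binomial) link(logit) vce(robust)

* Predictions on the proportion scale, which cannot fall outside [0, 1]
predict double phat, mu
summarize phat
```

Because the predictions are made directly on the proportion scale, no antilog step is needed.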
Another is a model using a beta-distributed response.
This is supported by -betafit- from SSC.
Maarten Buis gave a nice paper on this at the last
London users' meeting which is accessible at
http://repec.org/usug2006/Buis_proportions.pdf
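
A rough sketch of the beta approach follows. The variable names are again placeholders, and the muvar() option is given from my recollection of the -betafit- help file, so please check -help betafit- after installing. Note that a beta distribution requires the response to lie strictly between 0 and 1.

```stata
* Sketch only: share, x1 and x2 are hypothetical variable names.
* Install the user-written command from SSC
ssc install betafit

* Beta-distributed response with covariates on the mean;
* muvar() is recalled from the help file and should be verified there
betafit share, muvar(x1 x2)
```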
My recollection is that there is a vein of literature
on this problem in health economics. Other members may be able
to help out with details.
Your second problem sounds more difficult. There is work
on using the Dirichlet distribution for the response, implemented
in -dirifit- on SSC, but I suspect that rescaling each prediction
by dividing by the sum of the predictions within each year is by far
the easiest way to deal with this.
I would even take the extent to which your predictions have the correct
total to be one indication of how far your model is
sensible.
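
The rescaling could be sketched as below, assuming a variable phat holding the predicted shares and a variable year identifying each year. Both names are hypothetical.

```stata
* Sketch only: phat (predicted share) and year are hypothetical variable names.
* Sum of the predicted shares within each year
bysort year: egen double total_phat = total(phat)

* Rescaled predictions, which sum to 1 within each year by construction
generate double phat_rescaled = phat / total_phat

* How far the raw totals are from 1 is a rough check on the model
tabstat total_phat, by(year) statistics(mean)
```

Totals far from 1 in some years would be worth investigating before relying on the rescaled values.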
Nick
n.j.cox@durham.ac.uk
jc1926@gmx.de
> Using ‘regress’, I have estimated a model in which the
> dependent variable is the natural log of a percentage (that
> is, I take ln of a variable that varies between 0 and 1,
> which represents market shares over different years). I have
> two questions:
>
> 1. After fitting the model, I want to estimate the
> predicted values and convert back to a percentage by taking
> the antilog. It is my understanding that this introduces
> bias. How can I generate predicted values, expressed as
> percentages, that are not biased?
>
> 2. Even if I am able to generate unbiased predictions, I
> suspect that when I sum them up, they will not add to one for
> each year in the data. Is there any logical way of imposing
> the restriction that the transformed predicted values sum to
> one? In a 1995 Econometrica article (63(4): 841-890), Berry,
> Levinsohn, and Pakes employ a “simulator for market shares”
> that has this property, but it's unclear how to implement it in Stata.
>
> Any insights would be greatly appreciated.
>
> -jc
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>