Statalist The Stata Listserver


st: RE: dependent var. that is a logged percentage

From   "Nick Cox"
Subject   st: RE: dependent var. that is a logged percentage
Date   Tue, 27 Feb 2007 10:20:48 -0000

In fact, bias is not the only issue. What you propose
is quite likely to lead to predictions above 100%, 
even possibly within the range of the data. 
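
A quick way to see this is to back-transform the fitted values and count how many exceed 1. This is only a sketch: -lnshare-, -x1- and -x2- are hypothetical names standing in for your logged share and your predictors.

```stata
* Sketch only: lnshare, x1 and x2 are hypothetical variable names
regress lnshare x1 x2

* Linear prediction on the log scale
predict double xb, xb

* Naive back-transformation by taking the antilog
generate double share_hat = exp(xb)

* How many back-transformed predictions exceed 1 (i.e. 100%)?
* The !missing() guard matters because missing counts as > 1 in Stata
count if share_hat > 1 & !missing(share_hat)
summarize share_hat
```

Nothing in -regress- on the log scale stops exp(xb) from going above 1, so any nonzero count here shows the problem directly.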

There are various possibilities other than -regress-. 

One is a generalised linear model with logit link. 
You would need to scale from percents to proportions, 
but that is easy. (By the sound of it, your data are 
proportions already.) There is more on that at

FAQ     . . . . . . . . . . . . . . . . . . . . . . . . . Logit transformation
        8/04    How does one estimate a model when the dependent
                variable is a proportion?
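
Along the lines of that FAQ, a minimal sketch might look like the following. The variable names (-share-, -prop-, -x1-, -x2-) are hypothetical, and the first step is needed only if your variable is in percent rather than already on the 0 to 1 scale.

```stata
* Sketch only: share, x1 and x2 are hypothetical variable names
* Scale percents to proportions (skip if the data are already in [0, 1])
generate double prop = share / 100

* GLM with logit link and binomial family; robust standard errors
* because the response is a proportion, not a binomial count
glm prop x1 x2, family(binomial) link(logit) vce(robust)

* Predicted mean proportion, which necessarily lies between 0 and 1
predict double prop_hat, mu
summarize prop_hat
```

Because the predictions come through the inverse logit, they cannot fall outside 0 and 1, and no separate back-transformation step is needed.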

Another is a model using a beta-distributed response. 
This is supported by -betafit- from SSC. 
Maarten Buis gave a nice paper on this at the last 
London users' meeting which is accessible at

My recollection is that there is a vein of literature 
on this problem in health economics. Other members may be able
to help out with details. 

Your second problem sounds more difficult. There is work 
on using the Dirichlet as response implemented in -dirifit- on SSC, but 
I suspect that rescaling each prediction by dividing by the sum 
of the predictions is by far the easiest way to deal with this. 
I would even take the extent to which your predictions have the correct
total to be one indication of how far your model is 
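
The rescaling mentioned above can be sketched as follows, assuming a variable -prop_hat- holding predicted proportions from whatever model you fit and a variable -year- identifying the years; both names are hypothetical.

```stata
* Sketch only: prop_hat and year are hypothetical variable names
* Sum of the raw predictions within each year
bysort year: egen double total_hat = total(prop_hat)

* Rescale so that predictions add to 1 within each year
generate double prop_adj = prop_hat / total_hat

* Check: the rescaled predictions should now total 1 in every year
bysort year: egen double check = total(prop_adj)
summarize total_hat check
```

The summary of -total_hat- shows how far the unadjusted predictions are from adding to 1 in each year, which is the informal diagnostic just described.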

> Using ‘regress’, I have estimated a model in which the 
> dependent variable is the natural log of a percentage (that 
> is, I take ln of a variable that varies between 0 and 1, 
> which represents market shares over different years). I have 
> two questions:
> 1.	After fitting the model, I want to estimate the 
> predicted values and convert back to a percentage by taking 
> the antilog. It is my understanding that this introduces 
> bias. How can I generate predicted values, expressed as 
> percentages, that are not biased?
> 2.	Even if I am able to generate unbiased predictions, I 
> suspect that when I sum them up, they will not add to one for 
> each year in the data. Is there any logical way of imposing 
> the restriction that the transformed predicted values sum to 
> one? In a 1995 Econometrica article (63(4): 841-890), Berry, 
> Levinsohn, and Pakes employ a “simulator for market shares” 
> that has this property, but it's unclear how to implement it in Stata.
> Any insights would be greatly appreciated.
> -jc