The Stata listserver

Re: st: logY, Tobit and the prediction of Y

From   n j cox
Subject   Re: st: logY, Tobit and the prediction of Y
Date   Wed, 23 Apr 2003 14:27:49 +0100

Mathew Stalker replied to Christer Thrane:

My initial model is:

Y = a + b1x1 + controls + e

where Y is expenditures on a commodity and x1 is income.

Since there are a lot of zeroes, I use the Tobit approach. However, since
the log-linear model performed better than the linear, I use the former.
(Before the log transformation of Y, I follow convention and set zeroes
to 1.)

Accordingly, the estimated Tobit model is:

logY = a + b1x1 + controls + e
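
A minimal Stata sketch of the setup just described. The variable names
Y, x1 and z (a single stand-in for the controls) are hypothetical and do
not come from the data in question:

```stata
* Sketch only: Y, x1 and z are hypothetical variable names
* Set zeroes to 1 before taking logs, so that logY = 0 for those cases
generate double logY = ln(cond(Y == 0, 1, Y))

* Tobit on the log scale, left-censored at log(1) = 0
tobit logY x1 z, ll(0)
```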

The problem:

I want to predict the value of Y (not logY) for certain values of income
(and put it in a graph); that is, both the conditional Y (i.e. the Y
given that the threshold value of 0 was passed) and the unconditional (latent)
value of Y.

Does anyone know how to do this?

The prediction of Y from your model would simply be the exponential of the
predicted logY.

However, you should note that the log of zero is minus infinity, so in your
log model no observations where Y is zero will be included.  Is this really
what you want?
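
A note on the back-transformation, as a sketch only. It assumes, as the
Tobit model itself does, that the errors e are normal with mean 0 and
variance sigma^2. The symbols Y* (latent Y), xb (shorthand for
a + b1x1 + controls) and sigma are introduced here for illustration:

```latex
\begin{align*}
\log Y^{*} &= xb + e, \qquad e \sim N(0, \sigma^{2}) \\
\operatorname{median}(Y^{*} \mid x) &= \exp(xb) \\
E(Y^{*} \mid x) &= \exp(xb)\, E\{\exp(e)\} = \exp\!\left(xb + \sigma^{2}/2\right)
\end{align*}
```

Under that assumption the exponential of the predicted logY estimates
the median of latent Y, not its mean; the two differ by the factor
exp(sigma^2/2).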

Replacing 0s by 1s is clearly not very satisfactory. Using generalised
linear models with log link makes _that_ unnecessary.

glm Y <predictors>, link(log)

This approach has two extra advantages. First, it automatically
yields predictions on the scale of the response, here Y.
Second, the back-transformation approach mentioned by
Mathew raises bias issues, well documented in some
literatures (for some reason, there is a great deal on
this within health economics), which don't arise in the GLM approach.
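
A minimal sketch of how the predictions might then be obtained and
graphed. As before, Y, x1 and z are hypothetical variable names, and the
Gaussian family is spelled out here although it is the default for -glm-:

```stata
* Sketch only: Y, x1 and z are hypothetical variable names
* Zeroes in Y are kept as they are; Y itself is not log-transformed
glm Y x1 z, family(gaussian) link(log)

* Predicted mean of Y, already on the scale of Y
predict double Yhat, mu

* Plot the predictions against income (x1)
twoway line Yhat x1, sort
```

With controls in the model the plotted line will not be smooth unless
the controls are held at fixed values when predicting.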

