Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: "Dimitriy V. Masterov" <dvmaster@gmail.com>
To: Statalist <statalist@hsphsun2.harvard.edu>
Subject: st: tobit, margins, and prediction with outcome in logs
Date: Wed, 17 Nov 2010 22:57:23 -0500

I am trying to estimate a censored regression model of expenditures, where about a third of my observations have a zero value. I transformed the expenditure variable by taking the log and it passed the LM test for normality. I also can't reject the null of homoskedasticity with an auxiliary regression, so I think the statistical assumptions for a tobit are satisfied.

I am interested in seeing how the average latent variable prediction of exp depends on the value of the variable x1 and a dummy variable di. I would like to use margins to do something like this:

    margins, predict(ystar(`e(llopt)',.)) at(x1==(0(10)100)) over(di)

but I cannot figure out how to get margins to perform the transformation back to y from ln(y) with the expression() option. Instead, I tried to do the following:

    /* Trick Stata to handle the log transformation */
    gen lny=ln(exp)
    qui sum lny
    scalar gamma=r(min)
    replace lny = gamma - 0.0000001 if lny==.

    /* Estimate the model */
    tobit lny x1 x2 x3 di, ll
    matrix btobit=e(b)
    scalar sigma=btobit[1,e(df_m)+2]

    /* Transform back to E[y|x] */
    forvalues v=0(10)100 {
        replace x1=`v'
        predict xb if e(sample), xb
        generate yhat`v'=exp(xb+0.5*sigma^2)*(1-normal((gamma-xb-sigma^2)/sigma))
        drop xb
    }
    collapse (mean) yhat*, by(di)

Does this accomplish what I think it does? Is there a better way of doing this or estimating that does not require the re-transformation business? I don't really have any sensible exclusion restrictions to try a two-part model.

Dimitriy
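
For reference, the back-transformation used in the yhat line can be derived as follows. This is only a sketch under assumptions that the post does not spell out: the latent log expenditure is y* = x\beta + \sigma\varepsilon with \varepsilon standard normal, and observed expenditure equals exp(y*) when y* exceeds the censoring point \gamma (the scalar gamma above) and zero otherwise. The symbols \beta, \varepsilon, \phi, and \Phi are introduced here for illustration and do not appear in the original code.

```latex
\begin{align*}
E[y \mid x]
  &= \int_{(\gamma - x\beta)/\sigma}^{\infty} \exp(x\beta + \sigma z)\,\phi(z)\,dz \\
  &= \exp\!\left(x\beta + \tfrac{1}{2}\sigma^{2}\right)
     \int_{(\gamma - x\beta)/\sigma}^{\infty} \phi(z - \sigma)\,dz
     && \text{since } e^{\sigma z}\phi(z) = e^{\sigma^{2}/2}\,\phi(z - \sigma) \\
  &= \exp\!\left(x\beta + \tfrac{1}{2}\sigma^{2}\right)
     \left[1 - \Phi\!\left(\frac{\gamma - x\beta}{\sigma} - \sigma\right)\right] \\
  &= \exp\!\left(x\beta + \tfrac{1}{2}\sigma^{2}\right)
     \left[1 - \Phi\!\left(\frac{\gamma - x\beta - \sigma^{2}}{\sigma}\right)\right].
\end{align*}
```

Under these assumptions the last line matches exp(xb+0.5*sigma^2)*(1-normal((gamma-xb-sigma^2)/sigma)) term by term, so each collapsed mean would be an average of E[y|x] with x1 held at the given value, counting censored observations as zero spending. If the quantity of interest were instead expenditure conditional on being positive, the same expression would presumably be divided by 1 - \Phi((\gamma - x\beta)/\sigma), the probability of being uncensored.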