I am working with health data (MEPS) in which
out-of-pocket medical expenses (OOP) are the dependent
variable in an OLS regression. Because of the
positive skewness of such a variable, I would like to
use a normalizing transformation, i.e. the log of OOP.
However, because OOP has many zero observations, I
would need either to add a constant to OOP (some have
arbitrarily used $1) or to model the zeroes and the
positive values separately, which I'd rather not do.
(I have also considered the square root
transformation, etc., but would like to test the
results using log(OOP + constant).)
My question is: do you know of a method for searching
for the optimal constant to add to a variable so that
a log-transformation produces the optimal result? Deb
et al. (2005) suggest a 'grid search' for this value
(see link below for document). I know that grid
searches are used in the context of maximum
likelihood; is this a similar process? Would running
the model with different values of the constant and
comparing R-squared values and standard errors be more
appropriate?
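
To make the question concrete, here is a minimal sketch of what I understand such a grid search to be. It is my own reading, not necessarily the procedure in Deb et al. It assumes log(OOP + c) is linear in a single covariate with normal errors, and it compares candidate constants by the log-likelihood on the original OOP scale, which includes the Jacobian term -sum(log(OOP + c)). I use that criterion because the R-squared values from regressions with different constants refer to different dependent variables. The data are synthetic, and the names ols_rss, profile_loglik and grid_search_constant are hypothetical.

```python
import math
import random


def ols_rss(x, z):
    """Residual sum of squares from regressing z on a constant and x."""
    n = len(z)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    return sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))


def profile_loglik(y, x, c):
    """Log-likelihood of y when log(y + c) = a + b*x + e, e ~ N(0, s2).

    a, b and s2 are replaced by their ML estimates for the given c.
    The final term is the Jacobian of the transformation, which puts
    all values of c on the same scale (that of y itself).
    """
    z = [math.log(yi + c) for yi in y]
    n = len(y)
    sigma2 = ols_rss(x, z) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1) - sum(z)


def grid_search_constant(y, x, grid):
    """Return the grid value with the highest log-likelihood, plus all values."""
    logliks = [profile_loglik(y, x, c) for c in grid]
    best = max(range(len(grid)), key=logliks.__getitem__)
    return grid[best], logliks


# Synthetic stand-in for OOP (an assumption, not MEPS data): skewed, many zeros.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(500)]
y = [max(math.exp(1 + 0.5 * xi + random.gauss(0, 1)) - 3, 0.0) for xi in x]

# Candidate constants; they must be positive because y contains zeros.
grid = [0.1 * k for k in range(1, 101)]
c_hat, logliks = grid_search_constant(y, x, grid)
print(f"constant with the highest log-likelihood: {c_hat:.1f}")
```

If this is right, then the grid search is a maximum-likelihood search over the constant, with the regression coefficients concentrated out at each grid point.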
Thanks very much for your time!
Paul Jacobs
Ph.D. Candidate, Economics
American University
Link to Deb et al. (2005) presentation:
harrisschool.uchicago.edu/faculty/articles/iHEAminicourse.pdf