Statalist The Stata Listserver



Re: st: Grid Search in a Log Plus Constant Model


From   "Le Wang" <statauser@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Grid Search in a Log Plus Constant Model
Date   Wed, 8 Nov 2006 18:00:53 -0600

Paul,

Your problem is very similar to a selection problem. Although you
mentioned you would rather not model the zeroes and the positive
values separately, I think such models are probably the right way to go.

I guess that by "modelling the data separately for the zeroes and the
positive values" you mean the approaches used in the literature to
solve the selection problem. If so, they do not really model the data
separately; they model the zeroes and the positive values together.

Deb and Trivedi (2002) use a two-part model to address a similar
problem, and Cameron and Trivedi (2005) (Section 16.6, p. 553) discuss
alternatives, including bivariate sample selection models. Cameron has
the corresponding Stata code on his website to implement these
methods. Hope it helps.
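
To make the two-part idea concrete, here is a minimal sketch in Python
(not Stata, and not the code from Cameron's website). It assumes a
single categorical covariate, so that part one reduces to the share of
positive outcomes in each group and part two to the group mean of
log(y) among the positives, retransformed with Duan's smearing factor.
The function name two_part_by_group and the data are hypothetical.

```python
import math

def two_part_by_group(y, g):
    """Two-part model with one categorical covariate g (saturated in g).

    Part 1: P(y > 0 | g), estimated by the share of positive outcomes.
    Part 2: E[y | y > 0, g], from the group mean of log(y) among the
            positives, retransformed with Duan's smearing factor.
    Assumes every group has at least one positive outcome.
    Returns {group: (p_positive, mean_if_positive, expected_value)}.
    """
    groups = sorted(set(g))
    # Positive outcomes and group sizes, by group
    pos = {k: [yi for yi, gi in zip(y, g) if gi == k and yi > 0] for k in groups}
    size = {k: sum(1 for gi in g if gi == k) for k in groups}
    # Part 2 on the log scale: group means of log(y) for y > 0
    mean_log = {k: sum(math.log(v) for v in pos[k]) / len(pos[k]) for k in groups}
    # Duan smearing factor: average of exp(residual), pooled over groups
    resid = [math.log(v) - mean_log[k] for k in groups for v in pos[k]]
    smear = sum(math.exp(r) for r in resid) / len(resid)
    out = {}
    for k in groups:
        p = len(pos[k]) / size[k]             # part 1
        m = math.exp(mean_log[k]) * smear     # part 2, back on the raw scale
        out[k] = (p, m, p * m)                # E[y | g] = P(y > 0) * E[y | y > 0]
    return out

# Hypothetical data: out-of-pocket spending for two groups
oop = [0, 0, 40, 120, 0, 300, 90, 500]
grp = [0, 0, 0, 0, 1, 1, 1, 1]
for k, (p, m, ev) in two_part_by_group(oop, grp).items():
    print(k, round(p, 3), round(m, 2), round(ev, 2))
```

With continuous covariates, part one would typically be a logit or
probit and part two a regression on the positive observations only;
the zeroes and the positives still enter one overall prediction.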

Le

On 11/8/06, paul d jacobs <jacobspauld@yahoo.com> wrote:
I am working with health data (MEPS) where
out-of-pocket medical expenses (OOP) are a dependent
variable in an OLS regression.  Because of the
positive skewness of such a variable, I would like to
use a normalizing transformation, i.e. the log of OOP.
 However, because of the many zero observations for
OOP, the options are to either add a constant to OOP,
(some have used $1 arbitrarily), or to model the data
separately for the zeroes and the positive values,
which I'd rather not do.  (I have also considered the
square root transformation, etc., but would like to
test out the results using a log-constant).

My question is:  do you know of a method for searching
for the optimal constant to add to a variable so that
a log-transformation produces the optimal result?  Deb
et al. (2005), suggest a 'grid search' for this value
(see link below for document).  I know that grid
searches are used in the context of maximum
likelihood; is this a similar process?  Would running
the model with different values and comparing R2s and
standard errors be more appropriate?
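
One way to read the grid-search question, sketched here under stated
assumptions rather than as the method in Deb et al.: if
log(OOP + c) = a + b*x + e with normal errors, each candidate c can be
scored by the log-likelihood of OOP itself, which is the usual OLS
log-likelihood on the transformed scale minus the Jacobian term
sum(log(OOP + c)). Unlike R2, which is not comparable across
regressions with different dependent variables, this criterion is on a
common scale for every c. The sketch below is in Python with one
regressor; the names profile_loglik and grid_search and the data are
hypothetical.

```python
import math

def profile_loglik(y, x, c):
    """Log-likelihood of y, maximized over (a, b, sigma^2), for fixed c,
    under log(y + c) = a + b*x + e with e ~ N(0, sigma^2)."""
    n = len(y)
    z = [math.log(yi + c) for yi in y]        # transformed outcome
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    b = sxz / sxx                             # OLS slope
    a = zbar - b * xbar                       # OLS intercept
    rss = sum((zi - a - b * xi) ** 2 for xi, zi in zip(x, z))
    # Gaussian part with sigma^2 = rss/n, minus the Jacobian sum(log(y + c))
    return -0.5 * n * (math.log(2 * math.pi) + 1 + math.log(rss / n)) - sum(z)

def grid_search(y, x, grid):
    """Return the constant in `grid` with the highest profile log-likelihood."""
    return max(grid, key=lambda c: profile_loglik(y, x, c))

# Hypothetical data with some exact zeros
oop = [0, 3, 0, 12, 7, 40, 25, 90]
x = [0, 1, 2, 3, 4, 5, 6, 7]
grid = [0.5, 1, 2, 5, 10]
best = grid_search(oop, x, grid)
print(best, profile_loglik(oop, x, best))
```

A caution on this design: when the data contain exact zeros, this
likelihood can grow without bound as c shrinks toward zero, so the
grid has to stay away from zero and an interior maximum is not
guaranteed.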

Thanks very much for your time!

Paul Jacobs
Ph.D. Candidate, Economics
American University


Link to Deb, et al presentation:
harrisschool.uchicago.edu/faculty/articles/iHEAminicourse.pdf




*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


--
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Le Wang, Ph.D.
Minnesota Population Center
University of Minnesota
(o) 612-624-5818


