
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: ML-Evaluator for modelling retirement decisions

From   Tibor Hanappi
Subject   RE: st: ML-Evaluator for modelling retirement decisions
Date   Wed, 4 Jan 2012 13:08:02 +0000

Thanks a lot for your answers!

I guess the problem is that I tried to estimate all the parameters in one stage (while in fact it should be two stages). So, if I wanted to go with your first option, what would be the recommended way to do a grid search in Stata?
If I understand you correctly, there would be no need (in that case) to program an -ml- evaluator. The idea would be to start with a grid search (for ALPHA and GAMMA) based on some initial values for the (remaining) coefficient vector, then go on to compute the OV, estimate the standard -probit- model and start again with a new grid search. In this iterative procedure ALPHA and GAMMA would be passed on from stage one to stage two, and the resulting coefficient vector would be passed back to stage one. The whole procedure can then be repeated until some convergence criterion is reached.
The alternative would be to develop a two-stage model entirely within the -ml- program. Do I understand you correctly?
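
One way to picture the grid search described above is the following minimal sketch. It is not the procedure from the thread: it uses a simpler variant in which the -probit- is re-fitted at every grid point and the pair (ALPHA, GAMMA) with the highest log likelihood is kept. The data are simulated, and the program compute_ov, the variables inc_work and inc_ret, and the toy OV formula are all hypothetical stand-ins for the omitted microsimulation code.

```stata
* Toy data so the sketch runs on its own (simulated, NOT the real sample).
clear
set seed 2012
set obs 2000
generate double inc_work = exp(rnormal(0, 0.3))
generate double inc_ret  = 0.6 * inc_work * exp(rnormal(0, 0.2))
generate double SSW      = rnormal()
generate double gn_age   = 55 + floor(10 * runiform())
generate double ov_true  = 1.36 * inc_ret^0.75 - inc_work^0.75
generate byte   GO       = (0.5*SSW - 1.5*ov_true + 0.1*(gn_age - 60) + rnormal() > 0)
drop ov_true

* Hypothetical stand-in for the omitted option-value calculation.
capture program drop compute_ov
program define compute_ov
    args alpha gamma
    capture drop OV
    * Toy formula only; replace with the real OV computation.
    generate double OV = `alpha' * inc_ret^`gamma' - inc_work^`gamma'
end

* Grid search: re-fit the probit at each (alpha, gamma), keep the best fit.
local best_ll = .
forvalues i = 0/10 {
    local a = 1 + `i'/10              // alpha on the grid 1, 1.1, ..., 2
    forvalues j = 1/10 {
        local g = `j'/10              // gamma on the grid 0.1, 0.2, ..., 1
        compute_ov `a' `g'
        quietly probit GO SSW OV gn_age
        if missing(`best_ll') | e(ll) > `best_ll' {
            local best_ll = e(ll)
            local best_a  = `a'
            local best_g  = `g'
        }
    }
}

display "best alpha = `best_a', best gamma = `best_g', log likelihood = `best_ll'"

* Final probit at the selected grid point.
compute_ov `best_a' `best_g'
probit GO SSW OV gn_age
```

Note that the standard errors reported by the final -probit- treat ALPHA and GAMMA as known, so they do not reflect the uncertainty from the grid search itself.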

Thanks again! 

(Below is what I wanted to answer to Nick)

Let me try to make my point a bit clearer.
(1) The matrix b[,] should refer to the coefficient vector as defined by the -ml model- statement:
ml model gf0 ML_OPV (theta: GO = SSW OV gn_age) /hp_alpha /hp_gamma
In this case I would expect b[,] to be a row-vector with six elements, i.e. one parameter estimate for each of the variables: SSW, OV, gn_age, constant, hp_alpha and hp_gamma. The likelihood function will be maximized with respect to the elements of this vector.

(5)+(4) As long as I assume that alpha and gamma are constant I'm perfectly fine with the standard model. In that case I can compute OV once (based on the exogenous parameters) and then estimate a standard binary probit model on SSW, OV, gn_age and a constant. I would thus have four parameter estimates and the parameter for OV would measure the effect of OV on current retirement given (exogenous) alpha and gamma.
But this is unsatisfactory since it implies that the two parameters used to calculate the option value are exogenous. I was therefore trying to write a likelihood-evaluator that includes parameter estimates for alpha and gamma, and evaluates the likelihood function based on all six parameters.
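
For reference, the likelihood the evaluator is meant to maximize can be written out as below. This is only a restatement of the code quoted further down (the lines that build xb and fill lnfj); the notation OV_i(alpha, gamma), marking that the option value is recomputed from the two utility parameters, is an assumption about the omitted code.

```latex
\ln L(\beta,\alpha,\gamma)
  = \sum_{i} \Big[ GO_i \,\ln \Phi(x_i\beta) + (1 - GO_i)\,\ln \Phi(-x_i\beta) \Big],
\qquad
x_i\beta = \beta_1\, SSW_i + \beta_2\, OV_i(\alpha,\gamma) + \beta_3\, \mathit{gn\_age}_i + \beta_0 .
```

With alpha and gamma held fixed, OV_i is just data and this reduces to the standard probit log likelihood in four parameters.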

(2) My idea was to initialize two new parameters (each defined over a specific range) and let the program recalculate (and replace) OV at each iteration based on the current values. Since hp_alpha and hp_gamma can take on any value between minus and plus infinity, I made use of the normal cdf to map those two parameters onto the desired interval. This is why I replace the two variables by a transformation of the corresponding entries in the coefficient vector, b[1,5] and b[1,6]. Since the likelihood function is maximized with respect to the coefficient vector b[,], changes in b[1,5] and b[1,6] will affect the values of alpha and gamma at each step of the maximization procedure (they do). Though these parameters do not enter the likelihood function directly, they lead to changes in the OV variable and should thus affect the likelihood function (that's also the case).

(3) You're right. I will change that.
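
As a small illustration of points (2) and (3), the mapping through the normal cdf can be held in scalars rather than variables. The sketch below is an assumption about how that might look: the matrix b simply copies the starting values from the -ml init- line (with the constant set to 0), and the last lines invert the mapping to show which starting values would correspond to the literature values ALPHA=1.36 and GAMMA=0.75.

```stata
* Coefficient vector in the order SSW, OV, gn_age, _cons, hp_alpha, hp_gamma
* (values copied from the -ml init- line; the constant is assumed to be 0).
matrix b = (0.000004, -0.0008, 0.16, 0, -0.001, 0.5)

* Map the unrestricted parameters onto (1,2) and (0,1), stored as scalars.
scalar alpha = 1 + normal(b[1,5])
scalar gamma = normal(b[1,6])
display "alpha = " scalar(alpha) ", gamma = " scalar(gamma)

* Inverse mapping: starting values that reproduce ALPHA=1.36 and GAMMA=0.75.
scalar hp_alpha0 = invnormal(1.36 - 1)
scalar hp_gamma0 = invnormal(0.75)
display "hp_alpha start = " scalar(hp_alpha0) ", hp_gamma start = " scalar(hp_gamma0)
```

Inside an evaluator, the same expressions would use `b' in place of b, and tempname rather than fixed scalar names.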

From: Nick Cox
Sent: Wednesday, January 04, 2012 10:59 AM
Subject: Re: st: ML-Evaluator for modelling retirement decisions

References in the minimal form (name, date) are deprecated on
Statalist. (See for example Buis, 2011; Cox, 2011 for reminders of
this point.)

There seems to be confusion here on several levels. First, you refer
to b[,], evidently a matrix, but it is not clear how your program can
access it. Second, your -ml- program always resets alpha and gamma, so
as you say the iteration can only repeat the same step. Third, it is
not good style to put parameters in variables. Fourth, it is not clear
why the same utility calculation needs to be repeated each time that
-ml- iterates. Fifth, this does sound like a standard model so I
wonder why you are programming it independently.

The main point about -ml- is that it does most of the work for you. It
is usually not necessary to initialise parameters explicitly unless
the model has to strain to fit the data.

I am not an economist and I don't understand this kind of model, so
you will need to look to others for further comment.


On Wed, Jan 4, 2012 at 9:29 AM, Tibor Hanappi wrote:
> I'm modelling retirement decisions based on an option value framework (Gruber and Wise, 2002). Up until now I have constructed a microsimulation model calculating the option value for each individual in each year. In short, the option value is a forward-looking variable that summarizes the future options (with regard to retirement) of an individual at a certain point in time.
> Based on this approach I estimate a binary probit model with retirement in the current year as dependent variable and option value (OV), social
> security wealth (SSW), age and some other variables as covariates.
> However, since the option value is denoted in utility terms I have to assume some exogenous parameters of the utility function. There are basically
> two parameters: GAMMA, which defines marginal utility of income, and ALPHA, which is a factor defining the utility gain through leisure in retirement (relative to work). Exogenous values taken from the literature would be: GAMMA=0.75 and ALPHA=1.36
> As a next step I’m writing a maximum likelihood evaluator so that I can estimate those two parameters jointly with the binary probit model. Since I wanted to keep it simple I’m using a gf0 evaluator. Also, I had to make sure that the two parameters stay within their ranges (0 to 1 for GAMMA and 1 to 2 for ALPHA), so I transform parameters b[1,5] (hp_alpha) and b[1,6] (hp_gamma) through the use of the normal cdf. Though my program passes ml check, it doesn’t converge. In fact, it seems to be unable to go on to the next iteration (#1), though it keeps on repeating every step in the program. Here is a reduced version of the code.
> program ML_OPV
> args todo b lnfj
> tempvar alpha gamma
> quietly gen double `alpha'=1+normal(`b'[1,5])
> quietly gen double `gamma'=normal(`b'[1,6])
> *display "ALPHA: " `alpha'
> *display "GAMMA: " `gamma'
> quietly {
> forvalues j=2002(1)2012 {
>      * Calculate Utility from Retirement based on `alpha' and `gamma'
>      < CODE OMITTED >
>      * Calculate Utility from Labour Income based on `gamma'
>      < CODE OMITTED >
> }
> tempvar xb
> gen double `xb' = `b'[1,1]*SSW + `b'[1,2]*OV + `b'[1,3]*gn_age +`b'[1,4]
> replace `lnfj' = ln(normal(`xb')) if $ML_y1 == 1
> replace `lnfj' = ln(normal(-1*`xb')) if $ML_y1 == 0
> }
> end
> ml model gf0 ML_OPV (theta: GO = SSW OV gn_age) /hp_alpha /hp_gamma
> ml init SSW=.000004 OV=-.0008 gn_age=.16 /hp_alpha=-.001 /hp_gamma=.5
> ml max, technique(nr) trace showstep
> I recognize that it might be quite hard to help me out from a distance, but any help would be greatly appreciated. In particular, I’m wondering whether my approach concerning ALPHA and GAMMA is valid or whether there is an easier way to do it.

*   For searches and help try:

