st: Panel-data maximum likelihood: trouble with mlsum

From	"Jakusch, Sven-Thorsten" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: Panel-data maximum likelihood: trouble with mlsum
Date	Tue, 17 Apr 2012 15:53:06 +0200

Dear Statalisters,
I am quite new to Stata and have already found some answers on the net and in various textbooks, but it seems I'm stuck here.
For a recent project, I am trying to run a panel-data probit maximum likelihood estimation to recover the preference parameters of investors at the individual level. The approach is loosely adapted from Harrison's (2008) "Maximum Likelihood Estimation of Utility Functions Using Stata": I implemented a trading model, which I now want to calibrate using ML.
The original dataset has a structure similar to the one in Gould, Pitblado and Sribney (2006) "Maximum Likelihood Estimation with Stata", p. 110. It looks more or less like this (I hope it is not too scrambled):

Obs. No.  Investor_ID  Security_ID  Date  Choice  Charact. of Sec.  L.L.
1         1            1            t_0   1                         .
2         1            1            t_1   1                         .
3         1            1            t_2   0                         ln(L1)
4         1            2            t_0   1                         .
5         1            2            t_1   1                         .
6         1            2            t_2   1                         .
7         1            2            t_3   0                         ln(L2)
8         2            1            t_0   1                         .
…and so on....

As I am also interested in which characteristics of the securities affect the investor's decision to hold (=1) or sell (=0), I intend to generate a "sub"-log-likelihood for each security, stored at the last observation of "Choice" for that security, and then aggregate the sum of these log-likelihoods at the investor level.
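Written out, and assuming the iid structure used in the code further down, the log-likelihood described here would be something like the following. The notation is introduced only for this sketch and is not taken from Harrison (2008) or Gould, Pitblado and Sribney (2006): i indexes investors, j securities, t dates, y_{ijt} is "Choice", and \Delta U_{ijt} is the utility difference (alternative 2 minus alternative 1). It also assumes that ln(L1), ln(L2), ... in the table stand for the sum over all dates of the respective security.

```latex
\ln L_{ij} = \sum_{t=0}^{T_{ij}} \Big[ (1 - y_{ijt}) \, \ln \Phi\big(\Delta U_{ijt}\big) + y_{ijt} \, \ln \Phi\big(-\Delta U_{ijt}\big) \Big],
\qquad
\ln L_{i} = \sum_{j} \ln L_{ij}
```

Here \Phi is the standard normal CDF, so a sell (y = 0) contributes \Phi(\Delta U) and a hold (y = 1) contributes \Phi(-\Delta U), matching the two `replace` lines in the program.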

A sketch of my attempt looks like this (I am using Stata 10.0 and have already run "update query"):

.....
program define ML_My_problematic_model_1 // define maximum likelihood program for the panel dataset

    args todo b lnf // define variables and coefficient vector b
    tempvar alpha lambda gamma last lj /* some more variables */ utility_diff

    // variables of interest are alpha, lambda and gamma
    mleval `alpha' = `b', eq(1)
    mleval `lambda' = `b', eq(2)
    mleval `gamma' = `b', eq(3)

    quietly {
        // (contains more or less specifications of the model)
        // define likelihood function per security_ID:

        // here I tried to generate the sub-likelihood functions for each security_ID
        // (in line with Harrison (2008) as mentioned above)
        by security_ID: gen double `utility_diff' = `utility_alternative_2' - `utility_alternative_1'
        by security_ID: gen byte `last' = _n==_N

        // construct likelihood for utility difference under iid assumption:
        gen double `lj' = .
        by security_ID: replace `lj' = normal(`utility_diff') if $ML_y1==0
        by security_ID: replace `lj' = normal(-`utility_diff') if $ML_y1==1

        // sum the added likelihood functions at the end of each security_ID
        mlsum `lnf' = ln(`lj') if `last'==1
        if (`todo'==0 | `lnf'>=.) exit
    }
end
.....

It is mainly the indented middle part that worries me. My problem is that I am trying to generate a sub-log-likelihood at the last observation of each group (here "security_ID") for the "Choice" variable. When I try to sum these up with mlsum, I get either a log likelihood of 0 or an error message stating that the numerical derivatives are flat or cannot be obtained, no matter what I do. ml check runs fine and indicates no serious issues. I obtained the results (if not interrupted after 300 iterations) using:

statsby [alpha]_cons [lambda]_cons [gamma]_cons, by(person_ID) clear: ml model d0 ML_My_problematic_model_1 (alpha: Choice  **a lot of other variables** = ) (lambda: ) (gamma: ), maximize technique(dfp nr)

To check the results I got with this program, I created a "test" sample to play around with, for which I also wrote the likelihood program for one particular investor, but with the securities listed side by side. Surprisingly, this program works quite well, but it would obviously be messy to apply to all investors (if it helps, I can post it here as well).
I can't see a large difference from the code shown in Gould, Pitblado and Sribney (2006), p. 111, which confuses me. I hope this information is sufficient for a first assessment of the problem and for an indication of where my mistakes are.
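For comparison, a minimal sketch of the by-group accumulation pattern could look like this. It is an illustration under assumptions, not the code from the book and not a tested fix: the program name my_panel_probit_d0 is hypothetical, a single linear equation `xb' stands in for the utility difference of the trading model, and security_ID is assumed to be a variable in the data in memory.

```stata
* Hypothetical d0 evaluator: one log-likelihood contribution per security_ID.
* Assumption: `xb' replaces the model-specific utility difference.
program define my_panel_probit_d0
    version 10.0
    args todo b lnf
    tempvar xb lj lnlj last

    mleval `xb' = `b', eq(1)

    quietly {
        * -by- requires the data to be sorted by the grouping variable
        sort security_ID

        * per-row choice probability: sell (0) -> normal(xb), hold (1) -> normal(-xb)
        gen double `lj' = cond($ML_y1 == 0, normal(`xb'), normal(-`xb'))

        * running sum of log probabilities within each security
        by security_ID: gen double `lnlj' = sum(ln(`lj'))

        * flag the final row of each security, where the running sum is complete
        by security_ID: gen byte `last' = (_n == _N)

        * add up one completed contribution per security
        mlsum `lnf' = `lnlj' if `last' == 1
    }
end
```

In this sketch every row of a security enters the running sum, so the value picked up on the last row is the log likelihood of the whole holding sequence; `ln(`lj') if `last'==1` on its own would use only the final row's probability.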

Thank you very much in advance!

Sven 

Sven-Thorsten Jakusch

Lehrstuhl für BWL, insbesondere Finanzen - Prof. Dr. Andreas Hackethal
House of Finance |  Grüneburgplatz 1 | 60323 Frankfurt am Main
Tel.: + 49 (0)69 798-33677