That sounds like a not-so-practical model. How much time is it going
to take you to look at all 2500 coefficients?
Also, what kind of nonlinearity do you have? Diagnosing nonlinearity
generally requires a lot of data; if you want to deal with 2500
variables, you would probably need on the order of 100 thousand
observations to identify both linear and nonlinear effects.
Remember, too, that introducing dummy variables amounts to fixed
effect estimation only in linear models. The underlying statistical
principle is that of sufficient statistics: when a sufficient
statistic for the fixed effects exists, fixed effect estimation is
feasible; otherwise it is not. That restricts the class of models to
the linear, logit and Poisson models.
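To make the sufficient-statistic argument concrete for the logit case,
here is a minimal Python sketch (not Stata, and not your model). It
assumes a two-period panel logit with an individual effect alpha; the
function names and the numbers are made up for illustration. Once you
condition on y1 + y2 = 1, the probability of the sequence (0, 1) should
no longer depend on alpha:

```python
import math

def logit_prob(y, x, beta, alpha):
    """P(y | x) in a logit model with individual effect alpha."""
    p1 = 1.0 / (1.0 + math.exp(-(alpha + beta * x)))
    return p1 if y == 1 else 1.0 - p1

def cond_prob_01(x1, x2, beta, alpha):
    """P(y1=0, y2=1 | y1 + y2 = 1) for one unit observed in two periods."""
    p01 = logit_prob(0, x1, beta, alpha) * logit_prob(1, x2, beta, alpha)
    p10 = logit_prob(1, x1, beta, alpha) * logit_prob(0, x2, beta, alpha)
    return p01 / (p01 + p10)

# Illustrative values only (assumptions, not taken from any data set).
beta, x1, x2 = 0.7, -0.3, 1.2

# The same conditional probability for three very different fixed effects.
probs = [cond_prob_01(x1, x2, beta, a) for a in (-3.0, 0.0, 2.5)]

# Closed form that does not involve alpha at all.
closed_form = 1.0 / (1.0 + math.exp(-beta * (x2 - x1)))

print(probs, closed_form)
```

The three conditional probabilities agree with the alpha-free closed
form, which is why the fixed effects can be conditioned out here. If
your likelihood has no such statistic, the same trick is not available.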
Finally, your model may not be (empirically) identified in the region
near the maximum. In the simplest textbook case, you have all the
dummies along with the constant; the random values that -ml check-
picks may indeed give different likelihoods, but values near the top
will not. In more complex cases, your likelihood may collapse for
certain combinations of parameters: suppose you have something like
b1*b2*b3 in your likelihood, and it is estimated to be zero. If the
true value of b1 is zero, then neither b2 nor b3 is identified. This
does happen in nonlinear models sometimes, and it can be cured by
reparameterization. You need to do your paper-and-pencil work on that,
though.
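The b1*b2*b3 case is easy to reproduce in a toy setting. The following
is a minimal Python sketch under assumptions of my own: a Gaussian
model y = b1*b2*b3*x + e with unit variance, simulated data, and
hypothetical function names. It is not your likelihood; it only shows
what a numerical derivative sees when b1 sits at zero.

```python
import random

def loglik(b1, b2, b3, xs, ys):
    """Gaussian log likelihood (unit variance, constants dropped)
    for the toy model y = b1*b2*b3*x + e."""
    return -0.5 * sum((y - b1 * b2 * b3 * x) ** 2 for x, y in zip(xs, ys))

def num_deriv_b2(b1, b2, b3, xs, ys, h=1e-4):
    """Central-difference derivative of the log likelihood in b2."""
    up = loglik(b1, b2 + h, b3, xs, ys)
    down = loglik(b1, b2 - h, b3, xs, ys)
    return (up - down) / (2 * h)

# Simulated data in which the true product b1*b2*b3 is zero.
random.seed(1)
xs = [random.gauss(0, 1) for _ in range(200)]
ys = [random.gauss(0, 1) for _ in xs]

# At b1 = 0 the likelihood does not move with b2 at all.
flat = num_deriv_b2(0.0, 0.5, 2.0, xs, ys)

# Away from b1 = 0 the same derivative is clearly nonzero.
not_flat = num_deriv_b2(1.0, 0.5, 2.0, xs, ys)

print(flat, not_flat)
```

With b1 at zero, moving b2 changes nothing, so the optimizer has no
information about it. In this toy case the obvious reparameterization
is to estimate the product c = b1*b2*b3 as a single parameter; whether
something similar works for your model is the paper-and-pencil part.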
On 4/4/06, Jason DeBacker <debacker@eco.utexas.edu> wrote:
> Hi Statalisters,
>
> I'm having trouble with the estimation of a non-linear likelihood
> function on panel data. My program passes all the tests in ml check
> and goes through ml search, but in ml max I get the following error:
> "could not calculate numerical derivatives
> flat or discontinuous region encountered"
>
> The model I'm trying to estimate is in Groseclose, Levitt, and Snyder's
> 1999 article in the APSR. I think the problem might be that since
> there are a lot of dummy variables in the model and each one seldom
> takes on a value other than zero (at most a particular dummy variable =
> 1 in only 1% of the obs), there are problems calculating the numerical
> derivatives. I've been able to estimate a simplified version of the
> model when I have dummy variables that take on a value of one 50% of
> the time.
>
> Does it sound like I have a correct diagnosis of the problem?
> Suggestions to fix this?
>
> I'd like to run this in Stata so I could get standard errors more
> easily, but I have started writing a Matlab program to do the
> estimation. Additionally, Matlab is a bit cumbersome with the number
> of dummy variables I have (over 2500). Will Matlab be able to estimate
> the model faster than Stata?
>
> Thanks very much for any help.
>
> Sincerely,
> Jason
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
--
Stas Kolenikov
http://stas.kolenikov.name