Statalist


Re: st: How to compare performance (goodness-of-fit) of very different modelling approaches?


From   Eva Poen <[email protected]>
To   [email protected]
Subject   Re: st: How to compare performance (goodness-of-fit) of very different modelling approaches?
Date   Fri, 16 Jan 2009 15:39:56 +0000

Thanks, Austin. My outcome variable does indeed have 21 "categories", but
from an economic point of view it can be considered continuous. 21
values seem a bit too many for a -byhist-, but it is a nice program,
and I will certainly consider using it in some way to aid
visualization.

Eva
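
A note on the tabulation behind such a graph: the interlaced histograms Austin describes come down to grouping the observed values by each model's (rounded) predicted value and computing fractions within each group. The sketch below shows that step only. It is written in Python rather than Stata, purely for illustration, and the function name `observed_by_predicted`, the rounding rule, and the toy data are all assumptions, not part of -byhist-.

```python
from collections import Counter, defaultdict


def observed_by_predicted(observed, predicted, lo=0, hi=20):
    """For each rounded predicted value, return the fractions of observed values."""
    groups = defaultdict(Counter)
    for y, yhat in zip(observed, predicted):
        # Round the prediction and clamp it to the bounded outcome scale.
        k = min(max(int(round(yhat)), lo), hi)
        groups[k][y] += 1
    return {
        k: {y: n / sum(counts.values()) for y, n in sorted(counts.items())}
        for k, counts in sorted(groups.items())
    }


# Hypothetical toy data: six observations and predictions from two models.
observed = [0, 0, 5, 10, 20, 20]
pred_a = [0.2, 1.4, 4.6, 10.1, 19.7, 20.3]
pred_b = [3.1, 2.8, 3.3, 9.6, 12.2, 17.8]

table_a = observed_by_predicted(observed, pred_a)
table_b = observed_by_predicted(observed, pred_b)

for name, table in (("A", table_a), ("B", table_b)):
    for k, fractions in table.items():
        print(name, "predicted", k, "observed fractions", fractions)
```

A model whose table concentrates each group on the matching observed value fits better in this sense. In the toy data, the hypothetical model B lumps observed 0s and 5s together under a prediction of 3.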


2009/1/15 Austin Nichols <[email protected]>:
> Eva Poen <[email protected]>:
> Does the outcome variable have only 21 categories {0,...,20} or is it continuous?
> Maybe you could produce 21 histograms (with fractions for each of the
> models overlaid or "interlaced" on one graph) characterizing the
> distribution of observed values for those predicted to have Y=0, ...
> 20.  See -byhist- on SSC for making interlaced histograms.
>
> On Thu, Jan 15, 2009 at 11:59 AM, Eva Poen <[email protected]> wrote:
>> Dear all,
>>
>> currently I am working on slightly complicated mixture models for my
>> data. My outcome variable is bounded between 0 and 20, and has mass at
>> either end of the interval. Whether I analyse the data on the
>> original [0,20] scale or on a transformation to [0,1] (fractions) does
>> not make any difference to me.
>>
>> My question concerns the goodness of fit. I would like to compare the
>> goodness of fit of the complicated finite mixture model to much simpler
>> models, e.g. the tobit model, the glm model, and a hurdle
>> specification. Since the likelihood values of these models differ
>> substantially, likelihood based measures such as BIC appear to be
>> inadequate for the purpose. Also, measures that compare the model
>> likelihood of the fitted model to the null likelihood ("pseudo r2")
>> are difficult since I can calculate them for the tobit and glm models,
>> but not for the mixture model, as it is unclear what the null model
>> would be.
>>
>> So far I have been looking at crude measures like correlation between
>> predicted outcome and actual outcome, but I feel that this is
>> inadequate, especially since the outcome variable is bounded. I'd be
>> grateful for hints and comments. I am working with Stata 9.2.
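
One concrete reason the correlation between predicted and actual outcome is a crude measure: it is unchanged by any positive linear rescaling of the predictions, so a model whose predictions are compressed toward the middle of [0,20] can still score a correlation of 1. Error measures on the original scale, such as RMSE or MAE, do pick this up and can be computed for any model that yields predictions. The sketch below illustrates the point. It is Python rather than Stata, and the helper names and toy numbers are hypothetical, chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(obs, pred):
    """Root mean squared prediction error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute prediction error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


# Hypothetical outcome on [0, 20] with mass at both ends.
observed = [0, 0, 4, 10, 16, 20, 20]
# Predictions close to the data, including near the bounds.
good = [0.5, 1.0, 4.5, 9.5, 15.0, 19.0, 19.5]
# Predictions perfectly correlated with the data but shrunk toward 10.
shrunk = [5 + 0.5 * y for y in observed]

for name, pred in (("good", good), ("shrunk", shrunk)):
    print(name, pearson(observed, pred), rmse(observed, pred), mae(observed, pred))
```

The shrunk predictions have the higher correlation of the two but much larger errors, mostly at the two mass points, which is where a bounded outcome makes the difference between models most visible.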


