
st: Model selection using AIC/BIC and other information criteria

From   kokootchke <[email protected]>
To   statalist <[email protected]>
Subject   st: Model selection using AIC/BIC and other information criteria
Date   Tue, 23 Jun 2009 19:07:02 -0400

Dear all,

I have a model in which the return (yield spread) of a bond issued by a country depends non-linearly on the country's probability of default. If I assume that this probability of default follows a logistic form, the log spread depends linearly on a set of regressors, which I take to be macroeconomic variables. To choose the best specification, I use AIC/BIC.
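For readers who want the step from the logistic form to the linear log spread spelled out, here is a minimal sketch of one standard route. It is an assumption for illustration, not necessarily the exact model used here: a one-period bond, risk-neutral lenders, zero recovery in default, risk-free rate r, spread s, default probability p, and a regressor vector x with coefficients \beta.

```latex
% No-arbitrage condition for a risk-neutral lender (zero recovery assumed):
\[
(1 - p)\,(1 + r + s) = 1 + r
\quad\Longrightarrow\quad
s = (1 + r)\,\frac{p}{1 - p}.
\]
% Logistic default probability:
\[
p = \frac{e^{x'\beta}}{1 + e^{x'\beta}}
\quad\Longrightarrow\quad
\frac{p}{1 - p} = e^{x'\beta}.
\]
% Taking logs gives a specification that is linear in x:
\[
\log s = \log(1 + r) + x'\beta.
\]
```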

One interesting thing I observe is that, in some cases, both AIC and BIC select a model that contains some variable X even when many data points are missing for that variable, which means I lose a lot of observations when I include X.

More specifically, I have:

* Model 1: includes X
regress log_spread a b c X
estat ic

which gives AIC = 915

* Model 2: excludes X
regress log_spread a b c
estat ic

which gives AIC = 1500

but the OLS regression in model 1 uses 1200 observations, while the one in model 2 uses 2800 observations (because X is missing for the other 1600 observations)!

You would think this happens because X is highly relevant for explaining the spread, but in some cases X is in fact statistically insignificant!

Can any of you explain this? 

Alternatively, could you tell me whether there are any other useful stats I could look at?
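One relevant point is that AIC and BIC are built from the log likelihood, which is a sum over the observations used in estimation, so values computed on 1200 and on 2800 observations are not on the same scale. Below is a minimal sketch of a like-for-like comparison. It assumes the dataset with log_spread, a, b, c and X is already in memory. The names common, with_X and without_X are hypothetical and chosen only for this illustration.

```stata
* Sketch only: fit both models on the sample that model 1 can use.

* Model 1: includes X, so observations with missing X are dropped
regress log_spread a b c X
estimates store with_X

* Flag the observations that model 1 actually used
generate byte common = e(sample)

* Model 2: excludes X, but is restricted to the same observations
regress log_spread a b c if common
estimates store without_X

* Show N, log likelihood, AIC and BIC for both models side by side
estimates stats with_X without_X
```

With both models restricted to the same observations, the N column should match across the two rows, and only then do the AIC and BIC columns compare like with like.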

Thank you very much!

