At 06:07 PM 6/23/2009, kokootchke wrote:

Dear all,

I have a model that says that the return or yield spread of a bond issued by a country depends non-linearly on the country's probability of default. If I assume that this probability of default follows a logistic form, I get that the log spread depends linearly on "stuff", which I take to be macroeconomic variables. To choose the best model, I use AIC/BIC.

One interesting fact I observe is that in some cases both AIC and BIC select a model that contains some variable X even when a lot of data points are missing for that particular variable, which means I actually lose a lot of observations when I include such a variable X.

More specifically, I have:

MODEL 1
    regress log_spread a b c X
    estat ic
which gives AIC = 915

then, MODEL 2
    regress log_spread a b c
    estat ic
which gives AIC = 1500

but the OLS in model 1 uses 1200 observations while the OLS in model 2 uses 2800 observations (because 1600 observations are missing in variable X)!!

You would think that this would be because X is very relevant to explain the spread, but in fact I see some cases when this variable is statistically insignificant!!

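AIC and BIC are only comparable across models fit to the same observations, and here the two regressions use different samples (1200 vs. 2800). I'm guessing a fairer comparison would be

    nestreg, lr: reg log_spread (a b c) X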
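If you want the AIC/BIC numbers themselves on a common footing, one option is to refit the smaller model on the observations used by the larger one. This is only a rough sketch using the variable names from your post; the marker variable `common` and the stored names `m1` and `m2` are made up for illustration.

```stata
* Sketch: compare information criteria on a common estimation sample.
* Model 1 sets the sample, since X is the variable with missing values.
regress log_spread a b c X
* Hypothetical marker: 1 for observations used in model 1, 0 otherwise
generate byte common = e(sample)
estimates store m1

* Refit model 2 on exactly the same observations
regress log_spread a b c if common
estimates store m2

* One table with N, log likelihood, AIC and BIC for both models
estimates stats m1 m2
```

With both fits restricted to the same 1200 observations, the AIC/BIC difference reflects the contribution of X rather than the change in sample size.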

