Statalist



RE: st: Model selection using AIC/BIC and other information criteria


From   Richard Williams <Richard.A.Williams.5@ND.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>, statalist <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Model selection using AIC/BIC and other information criteria
Date   Tue, 23 Jun 2009 22:20:36 -0500

At 08:39 PM 6/23/2009, kokootchke wrote:
Thank you, Richard. This was exactly what I thought... but I remember from my metrics classes a long time ago that both AIC and BIC depend on N (sample size), and I confirmed this by looking at the Wikipedia entries. Like you, I also feared that, even though both criteria adjust for the sample size, maybe you can't compare AICs and BICs across models that use different numbers of observations...

Here is a simple example that shows the sensitivity of BIC and AIC to sample size:

. sysuse auto, clear
(1978 Automobile Data)

. quietly reg  price mpg trunk weight

. estat ic

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
           . |     74   -695.7129   -682.6073      4     1373.215    1382.431
-----------------------------------------------------------------------------
               Note:  N=Obs used in calculating BIC; see [R] BIC note

. expand 2
(74 observations created)

. quietly reg  price mpg trunk weight

. estat ic

-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df          AIC         BIC
-------------+---------------------------------------------------------------
           . |    148   -1391.426   -1365.215      4     2738.429    2750.418
-----------------------------------------------------------------------------
               Note:  N=Obs used in calculating BIC; see [R] BIC note
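
The AIC and BIC columns in both tables can be checked by hand. The following is a worked sketch that assumes the usual definitions, with ln L the model log likelihood (the ll(model) column), k the df column, and N the Obs column; small differences in the last digit come from the rounded log likelihoods shown in the output.

```latex
\begin{align*}
\mathrm{AIC} &= -2\ln L + 2k, &
\mathrm{BIC} &= -2\ln L + k\ln N \\[4pt]
N = 74:\quad
\mathrm{AIC} &= -2(-682.6073) + 2(4) \approx 1373.215, &
\mathrm{BIC} &= 1365.2146 + 4\ln 74 \approx 1382.431 \\
N = 148:\quad
\mathrm{AIC} &= -2(-1365.215) + 2(4) \approx 2738.430, &
\mathrm{BIC} &= 2730.430 + 4\ln 148 \approx 2750.419
\end{align*}
```

Because -2 ln L is a sum over observations, it roughly doubles when the data are duplicated, and that term dominates both statistics.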

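Duplicating every observation leaves the coefficients unchanged, yet both statistics roughly double. So, even if data are missing at random on your X variable, the smaller sample size that results from including it will drive down the BIC and AIC stats quite a bit, whether or not the model actually fits better.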
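
One way to keep the comparison fair is to fit both models on the same observations. The following is only a minimal sketch, not something from the original exchange: it assumes rep78 (which has missing values in the auto data) stands in for "your X variable", and the names touse, with_rep78, and without_rep78 are made up for illustration.

```stata
sysuse auto, clear

* Larger model: rep78 has missing values, so fewer cars enter the fit
quietly regress price mpg trunk weight rep78
estimates store with_rep78

* Flag the observations the larger model actually used
generate byte touse = e(sample)

* Refit the smaller model on that same estimation sample
quietly regress price mpg trunk weight if touse
estimates store without_rep78

* Both rows should now report the same Obs, so AIC and BIC are on the same footing
estimates stats with_rep78 without_rep78
```

With the sample held fixed, any difference in AIC or BIC reflects the change in fit and in the number of parameters rather than the change in N.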


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam



