
st: RE: comparing logit models with large N


From   "Maarten Buis" <[email protected]>
To   <[email protected]>
Subject   st: RE: comparing logit models with large N
Date   Tue, 20 Dec 2005 11:33:36 +0100

Tomas:
If you have the entire population, then significance levels are meaningless. Significance testing assumes that you are uncertain whether your estimates equal the population values because of sampling variability; since you have no sample, you are no longer uncertain about the population value. Whatever parameter you estimate is the parameter that occurs in the population.

There are other sources of uncertainty: you are obviously uncertain about the proper model, and the variables could be (and probably are) measured with error. However, frequentist inference (what you do if you look at "p-values") does not take this uncertainty into account. In other words, it hopes that this uncertainty is swamped by the uncertainty due to sampling variation, which is obviously not the case for you.

In essence you have two options:
1) take a frequentist stance and claim you are certain, i.e. choose a model solely based on theory and just report the parameters without significance levels, standard errors, or confidence intervals, or
2) go Bayesian.

Good and accessible places to start learning about Bayesian statistics are Bolstad (2004) and Lancaster (2004). Unfortunately, you will probably have to use another stats package if you go Bayesian. R (http://www.r-project.org/) and WinBUGS/OpenBUGS (http://mathstat.helsinki.fi/openbugs/) are particularly popular among Bayesians.
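
As a rough illustration of why every added term looks "significant" when N is very large, here is a minimal sketch in Python (not Stata). The proportions 0.500 and 0.505, the group sizes, and the function names are made up for illustration. It computes the likelihood-ratio statistic for a logit model with one binary predictor against the intercept-only model, using expected cell counts rather than sampled data, so the statistic grows exactly in proportion to N while the effect itself stays the same.

```python
import math


def lr_statistic(n_per_group, p0, p1):
    """LR (G) statistic: logit with one binary predictor vs. intercept-only.

    Assumes two equally sized groups with outcome proportions p0 and p1,
    and uses expected cell counts instead of a random sample.
    """
    pooled = (p0 + p1) / 2  # outcome proportion under the intercept-only model
    g = 0.0
    for p in (p0, p1):
        for observed, expected in ((p, pooled), (1 - p, 1 - pooled)):
            g += 2 * n_per_group * observed * math.log(observed / expected)
    return g


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


# Same tiny difference in proportions, increasingly large N per group.
for n in (1_000, 100_000, 10_000_000):
    g = lr_statistic(n, 0.500, 0.505)
    print(f"n per group = {n:>10,}  LR = {g:10.3f}  p = {chi2_sf_1df(g):.3g}")
```

The difference of half a percentage point is nowhere near significant with 1,000 observations per group, but falls below the 5% level with 100,000 and far below it with 10,000,000, even though nothing about the effect has changed.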

HTH,
Maarten

William M. Bolstad (2004), "Introduction to Bayesian Statistics", Hoboken, NJ: Wiley
Tony Lancaster (2004), "An Introduction to Modern Bayesian Econometrics", Malden, MA: Blackwell Publishing

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology 
Vrije Universiteit Amsterdam 
Boelelaan 1081 
1081 HV Amsterdam 
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z214 

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

On Tuesday 20 December 2005 11:03 Thomas wrote:
> <snip> The problem is that if I add each new variable
> (or each new interaction between two variables) to the
> model, it always significantly contributes to the response
> variable and the fit of each complex model is always
> better than the previous (more parsimonious) one.
> (BIC is always lower, LR is always higher and D is
> always lower). I think that the problem is in large N.
> My data come from the whole population.  <snip>
