
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: A methodological problem
Date: Thu, 30 Oct 2008 10:33:18 -0000

I think it's fair to say that few statistically-minded people would say exactly the same thing on this question and that such people can disagree strongly on it. I will make a few points, but would not be surprised at partial or complete dissent over any one. I also underline that a complete discussion would include many more points that arise directly or indirectly here, although several of them would be the same points approached or phrased differently.

1. This attitude makes more sense for simple descriptive measures than it does for the kind of modelling you intend. Thus (naively or not) you can say that with a complete population the observed mean is _the_ mean and no inferential issues arise. But even for simple regression, once you postulate an error term then at least tacitly it has to have certain properties for any estimation method to work well, and that kind of statement goes beyond the observable data. Even calculation of means is arguably based on a similar model, regardless of whether the researcher knows that or makes it explicit. So, if you have an error term, that makes what you do inferential and not just descriptive, regardless of whether there are, or are not, more data out there that you might have collected. The point can be generalised to whatever probability distribution someone is working with.

2. Alternatively, any model you specify is likely to be incomplete in the sense that it does not capture all aspects of the process generating your data, e.g. all possible predictors, cluster or time or space structure, etc. So, there is an inferential aspect from that point of view as well.

3. What you are hoping is that results for a small sample [which happens to be the population] will behave like those from an arbitrarily large sample. But what mechanism makes that happen? Say I toss a coin 20 times, and then I lose it, so that there is no scope for taking any further measurements with that coin. Does that affect the variability of the data? In what sense does the sample know that it is as large as possible, and behave accordingly? I don't think it does. A sample of 20 is a sample of 20!

4. Sampling error is not the only kind of variability. There is also measurement error in most problems, although exceptionally perhaps not in many kinds of sports data.

A more general point is that imagining what would be nice for your problem doesn't make it come true for your data.

Nick
n.j.cox@durham.ac.uk

Carlo Amenta

I am studying a sports league with 20 teams in a specific year. I am using a 2SLS estimator because of a simultaneity problem with a specific variable, which was confirmed using the -ivendog- procedure. At this stage the study is cross-sectional and covers a specific season. Is it correct to say that I do not have any efficiency or consistency problem with the estimator, considering that I am studying the entire population and not a sample? As a matter of fact n=20, even if very small, is not the number of observations in a sample but all the teams in the league, so the entire population. I think I do not have to worry about any inference problem. Can someone confirm that or indicate any specific references?
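The coin example in point 3 of the reply can be checked with a short simulation. This is only a sketch under stated assumptions: the coin is taken to be fair (p = 0.5), the number of repetitions and the seed are arbitrary, and the function name `simulate_proportions` is made up for illustration. It repeats the experiment "toss 20 times" many times and compares the spread of the observed share of heads with the usual binomial standard error sqrt(p(1-p)/n).

```python
import math
import random
import statistics


def simulate_proportions(n_tosses=20, n_repeats=10_000, p=0.5, seed=12345):
    """Return the share of heads in each of n_repeats runs of n_tosses tosses.

    All defaults are illustrative assumptions, not values from the post
    (apart from n_tosses=20, which matches the coin example).
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n_tosses)) / n_tosses
        for _ in range(n_repeats)
    ]


props = simulate_proportions()

# Spread of the observed proportion across repeated sets of 20 tosses.
empirical_sd = statistics.pstdev(props)

# Binomial standard error of a proportion for n = 20 and p = 0.5.
theoretical_sd = math.sqrt(0.5 * 0.5 / 20)

print(f"empirical SD of the proportion: {empirical_sd:.3f}")
print(f"sqrt(p(1-p)/n) for n = 20:      {theoretical_sd:.3f}")
```

Nothing in the simulation depends on whether the coin is later lost. Any single run of 20 tosses carries the variability of n = 20, which is the sense in which "a sample of 20 is a sample of 20".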

**References**: **st: A methodological problem**, *From:* "Carlo Amenta" <carlo.amenta@gmail.com>
