
st: RE: when your sample is the entire population

From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: when your sample is the entire population
Date   Fri, 18 Jan 2008 20:02:54 -0000

I guess most people will have a short answer and a long answer 
to this one. You are going to get my short answer. 

Also, in statistical science, it seems that most people who think they
have a reasonably smart, or at least sensible, answer think some of the
other guys' reasonably smart answers are really fairly stupid, or at
least difficult to understand. So it may be colourful if and when people
start telling me that after a few decades of sweat and toil I _still_
don't understand statistics at all. 

If the question is what meaning is attached to a P-value, then there 
seem many possible partial answers. 

1. I am looking only at a sample of size n and I think of this as only
one of many possible samples of the same size from a larger population.
That is most plausible if someone really did select that sample using
random numbers, or something equivalent, and it's a greater or lesser
stretch otherwise. In many cases the sample you have just fell into your
lap somehow and the whole exercise is to treat the data _as if_ it were
a random
sample, partly because that's a calculation you can do. There's usually
some wishful thinking involved. Both texts and teachers vary enormously
on how candidly they discuss what is going on. This seems to be what is
most emphasised in most introductory courses and texts, but it may be
the least applicable story in statistical practice! 

2. I am looking at a sample of size n and I am willing to think of this
as one possible outcome among many. I can get a reference population by 
resampling the data I have repeatedly. Permutation and bootstrap methods
fit under this heading. I find it wry that in less than 30 years
bootstrap methods have gone from being widely regarded as a form of
cheating to being widely considered the best way to get a P-value in
many problems. 

3. I have a model, at its simplest a response as a function of
predictors plus some error term, and the uncertainty comes from the fact
that the model is always an approximation and is stochastic by virtue of
its error term. Whether your n is the whole N is immaterial, because the
uncertainty is not about sampling at all. 

4. What I have I regard as the realisation of a stochastic process
(usually in time, or space, or both). The realisation is unique, but at
least in principle there could have been other realisations. 

I won't quarrel with anyone who thinks #3 and #4 sound the same. 

5. Bayesians have other stories. 

6. I must have forgotten or be unaware of yet other stories. Bill Gould
has tried to explain quantum mechanics to me several times. I am pretty
clear that he understands it very well. 
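
Story #2 can be made concrete with a small sketch. What follows is only
an illustration in plain Python, not anything from the original
question: the data are made up, and the function names (slope,
permutation_p, bootstrap_se) are hypothetical helpers. The permutation
test shuffles the response against the predictor to see how unusual the
observed slope is; the bootstrap resamples (x, y) pairs with replacement
to gauge how much the slope varies.

```python
import random
import statistics


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def permutation_p(x, y, reps=2000, seed=1):
    """Two-sided P-value from shuffling y against x."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(reps):
        rng.shuffle(y_shuffled)  # breaks any association between x and y
        if abs(slope(x, y_shuffled)) >= observed:
            hits += 1
    # Count the observed arrangement too, so the P-value is never zero.
    return (hits + 1) / (reps + 1)


def bootstrap_se(x, y, reps=2000, seed=1):
    """Bootstrap standard error of the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # slope undefined if all x are identical
            continue
        slopes.append(slope(xs, [y[i] for i in idx]))
    return statistics.stdev(slopes)


# Made-up data: a linear trend plus Gaussian noise (an assumption).
data_rng = random.Random(42)
x = [float(i) for i in range(30)]
y = [0.5 * xi + data_rng.gauss(0, 3) for xi in x]

print("slope        :", slope(x, y))
print("permutation P:", permutation_p(x, y))
print("bootstrap SE :", bootstrap_se(x, y))
```

Neither calculation appeals to a larger population from which the 30
points were drawn; the reference distribution is built from the data in
hand.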

In these terms you seem to be saying #1 does not apply in your case, but
that still leaves other arguments, and there is a lot of scope for
arguing what is central to #1 in any case. 
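
The point in #3, that the uncertainty need not be about sampling, can be
checked by simulation. The sketch below is an illustration only, under
assumptions that are entirely mine: a made-up population of N = 50 units
that is observed in full, a true slope of 2 and Gaussian errors with
standard deviation 1. The predictors stay fixed and only the error term
is redrawn, yet the estimated slope still varies from one realisation to
the next, by about as much as the usual formula sigma / sqrt(Sxx) says.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# The whole population: N units with fixed predictor values.
N = 50
x = [i / 10 for i in range(N)]
beta, sigma = 2.0, 1.0  # assumed true slope and error SD

rng = random.Random(7)
slopes = []
for _ in range(5000):
    # Same N units every time; only the error term is redrawn.
    y = [beta * xi + rng.gauss(0, sigma) for xi in x]
    slopes.append(ols_slope(x, y))

mx = statistics.fmean(x)
sxx = sum((xi - mx) ** 2 for xi in x)
theoretical_se = sigma / sxx ** 0.5      # classical model-based SE
simulated_se = statistics.stdev(slopes)  # spread across realisations

print("theoretical SE:", theoretical_se)
print("simulated SE  :", simulated_se)
```

No unit is ever left out of the "sample" here, so none of the spread in
the slope comes from sampling.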

[email protected] 

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Lloyd Dumont
Sent: 18 January 2008 18:00
To: [email protected]
Subject: st: when your sample is the entire population

Hello, everyone.  I am facing a statistical
"challenge" that must be commonplace as microdata
becomes more and more accessible.  I have been
estimating  models using xtreg, as I have people
coming and going monthly over about a two year period.
Some estimates are significant, others not.

But, if the people in the "sample" are the entire
population that I am inferring to, conventional
measures of significance seem inappropriate.  But, I
have never read any social science that presents
regression estimates, and then says something along
the lines of, "These are what they are.  Significance
doesn't apply here."

-Am I roughly correct?
-Is there some other measure of "certainty" that might
be informative in these situations?
-Is there a name for this type of "sample" or
estimation issue?  Googling it has been a real
challenge, though there must be lots of
writing/commenting on this matter.

Thanks for your thoughts.  Lloyd Dumont
