Statalist



Re: st: Explaining the Use of Inferential Statistics Even Though I Have Population Data


From   sjsamuels@gmail.com
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Explaining the Use of Inferential Statistics Even Though I Have Population Data
Date   Sun, 31 May 2009 11:18:01 -0400

Antonio,

To say that you have the entire population is to imply that all
standard errors should be zero (sampling fraction = 1, so the finite
population correction drives the variance to zero). This is not
appropriate for measures of association. You almost never want to
estimate a measure of association for a finite population and test its
significance (it is never exactly zero, so it is always "significant").
Quote the references to show that the finite population correction
should be used only for descriptive studies.
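
As a rough illustration of the standard-error point (a sketch only, not taken from the references; the function name and the numbers are made up for the example), the textbook standard error of a mean under simple random sampling without replacement shrinks to zero as the sample approaches the whole population:

```python
import math

def se_mean_fpc(sd, n, N):
    """Standard error of a sample mean under simple random sampling
    without replacement, with the finite population correction applied.

    sd: standard deviation of the variable (hypothetical value here)
    n:  number of units observed
    N:  number of units in the finite population
    """
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    fpc = 1 - n / N  # equals 0 when the whole population is observed
    return sd / math.sqrt(n) * math.sqrt(fpc)

# Illustrative values only: sd = 10, population of 1000 units.
for n in (100, 500, 900, 1000):
    print(n, round(se_mean_fpc(10.0, n, 1000), 4))
```

With n = N the correction term is zero, so every design-based standard error is zero. That is sensible for a descriptive total or mean, but not for a test of association.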

There are (at least) two related points of view. One is that the
population members are drawn from a hypothetical super-population.
The other, alluded to by David, is that the outcomes were the result
of a random process applied to each member of the population,
conditional on covariates. Inference is based on variation in the
random process. See David Brillinger's paper "The natural
variability of vital rates and associated statistics", Biometrics 42
(1986), pp. 693-734, at
http://www.stat.berkeley.edu/~brill/Papers/biometrics.pdf. I
think that this version may be more acceptable to the reviewer who
raised the issue.
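
A minimal sketch of the random-process view, assuming the complete count of events is treated as a single Poisson outcome (the function name, the Wald-type interval, and the numbers are illustrative assumptions, not something taken from the paper): even with every member of the population observed, the rate still carries a standard error, because the process could have produced a different count.

```python
import math

def poisson_rate_summary(events, exposure, z=1.96):
    """Rate, standard error, and approximate (Wald) interval for a rate
    when the event count is assumed to be a Poisson outcome.

    events:   total number of events observed in the whole population
    exposure: total person-time (or population size) at risk
    z:        normal quantile for the interval (1.96 for roughly 95%)
    """
    if events <= 0 or exposure <= 0:
        raise ValueError("events and exposure must be positive")
    rate = events / exposure
    # Var(count) = mean for a Poisson variable, estimated by the count itself.
    se = math.sqrt(events) / exposure
    return rate, se, (rate - z * se, rate + z * se)

# Illustrative values only: 100 events among 10,000 people, all observed.
rate, se, (lower, upper) = poisson_rate_summary(100, 10000)
print(rate, se, lower, upper)
```

No unit is sampled here. The uncertainty comes entirely from the assumed Poisson variation in the outcome.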


-Steve

On Sat, May 30, 2009 at 10:06 PM,  <sjsamuels@gmail.com> wrote:
> Model uncertainty and "sampling error" (except in a very special sense
> involving super-populations) are also not reasons for using
> inferential statistics with population data.  None of the reasons you
> offer are present in the references I cited in
> http://www.stata.com/statalist/archive/2009-02/msg00806.html.  You
> would do well to cite one or more of these and paraphrase the
> arguments they contain...
>
> -Steve
>
>
> On Fri, May 29, 2009 at 5:25 PM, David Greenberg <dg4@nyu.edu> wrote:
>> I think that your statement below makes two errors of statistical logic. Inferential statistics has nothing to do with omitted variable bias. Nor will it take measurement error in your variables into account. There is a debate going back into the 1960s (if not earlier) as to whether it makes sense to use inferential statistics when you have population data. In your case, where you are estimating Poisson regressions, you are positing a random process, and you might reasonably argue that this justifies the use of inferential statistics. David Greenberg, Sociology Department, New York University
>>
>> ----- Original Message -----
>> From: Antonio Silva <asilva100@live.com>
>> Date: Friday, May 29, 2009 5:08 pm
>> Subject: st: Explaining the Use of Inferential Statistics Even Though I Have Population Data
>> To: Stata list <statalist@hsphsun2.harvard.edu>
>>
>>
>>> Dear Statalisters:
>>> I am revising an article for publication. I have data for the entire
>>> population I am studying. Nonetheless, I employ inferential
>>> statistics. Specifically, to analyze the data I used a multivariate
>>> Poisson regression model (several actually) for hypothesis testing.
>>> One of the reviewers asked the obvious question: Why did you use
>>> inferential statistics when you have data for the entire population? I
>>> have read discussions about this topic previously on this list, and I
>>> have a pretty clear idea in my head of why using inferential
>>> statistics still makes sense when your sample is the entire
>>> population.
>>> I am writing now with a simple question: Do you think an explanation
>>> that reads like this is appropriate and/or sufficient to deal with the
>>> reviewer’s point?
>>>
>>> “As I mention above, the data we utilize here come from the full
>>> population under study rather than a sample of the population. This,
>>> of course, begs the following question: Why do we use inferential
>>> statistics? Our answer is twofold. First, the models we estimate
>>> subsequently almost certainly are imperfect in the sense that they do
>>> not contain all possible predictors. In other words, our models are by
>>> definition approximations of reality rather than complete and accurate
>>> representations of reality. This makes inferential statistics
>>> appropriate. Second, inferential statistics account for error. Perhaps
>>> most important, they account for sampling error. But they also account
>>> for measurement error, which almost surely is present here (as it is
>>> everywhere in the social sciences). In short, though our data come
>>> from the entire population rather than a sample, we believe that
>>> inferential statistics are appropriate for our purposes.”
>>>
>
