st: Re: correct confidence intervals of -mean- ?

From   Dirk Enzmann
Subject   st: Re: correct confidence intervals of -mean- ?
Date   Sat, 06 Mar 2010 15:01:10 +0100


It is not the s.e.(mean) that poses a problem here but the df used by invttail(). I could understand it if -mean- calculated a z-test, where the df are irrelevant (although the CIs would then not be useful for small samples). But it calculates a t-test, and using the correct df is the essence of a t-test.
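
To make the point concrete, here is a minimal sketch that recomputes the interval for one group by hand, once with the full-sample df and once with the group's own df. It assumes the auto dataset used in this thread; the choice of rep78 == 3 and all local macro names are only for illustration, and the hand-computed s.e. (sd/sqrt(n)) is the -tabstat- semean, which may differ slightly from what -mean- reports.

```stata
sysuse auto, clear

* CIs as reported by -mean- (for comparison)
mean price, over(rep78)

* Group mean and s.e. for one group (rep78 == 3, chosen arbitrarily)
quietly summarize price if rep78 == 3
local m   = r(mean)
local n_g = r(N)
local se  = r(sd) / sqrt(r(N))

* Number of observations in the whole estimation sample
quietly count if !missing(price, rep78)
local n_all = r(N)

* 97.5% t quantiles with full-sample df and with group df
local t_all = invttail(`n_all' - 1, 0.025)
local t_grp = invttail(`n_g' - 1, 0.025)

* Lower and upper 95% limits under each choice of df
local lo_all = `m' - `t_all' * `se'
local hi_all = `m' + `t_all' * `se'
local lo_grp = `m' - `t_grp' * `se'
local hi_grp = `m' + `t_grp' * `se'

display "df = n_all - 1: " %9.1f `lo_all' "  " %9.1f `hi_all'
display "df = n_g - 1:   " %9.1f `lo_grp' "  " %9.1f `hi_grp'
```

The smaller the group, the larger the gap between the two critical values, so the interval based on the group's own df is the wider one.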


Kit Baum wrote:
> Dirk said
> Very carefully I want to ask: Are the confidence intervals given by
> -mean- really correct?
> Below I compare the results of -mean- with the results of a different
> procedure:
> and goes on to show that -mean- CIs can be reproduced by collapsing,
> but maintaining the DF in the confidence interval as that of the whole
> sample. These are the same standard errors of mean reported by
> tabstat price,by(rep78) stat(mean sd n semean)
> He wonders whether the DF used in calculating s.e.(mean) should be
> that of the full sample. I think that -mean- and -tabstat- are both
> using the notion that you have a model y = mu + \epsilon, where
> var(\epsilon) is a population parameter. Thus the variance of \epsilon
> is a constant for all subsamples, and when you calculate s.e. mean,
> you use the sqrt of that common variance and divide by the sqrt(sample
> size) of the subpopulation.  You can see that is being done by
> -tabstat- by comparing the sd, n and semean columns.
> What does surprise me is that the CIs generated by these methods
> differ so widely from those computed by
> reg price i.rep78
> margins rep78
> The differences are not just a small-sample/large-sample adjustment of
> the Root MSE. If you take apart the VCE of a regression of price on
> all five dummies, no constant term, you find a diagonal matrix
> containing the inverses of the respective sample sizes, so the
> difference has to lie in the computation of \hat{\sigma}^2 which
> multiplies inv(X'X).
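
The regression comparison in the quoted message can be sketched as follows. This is only an illustration on the auto dataset: it fits the no-constant regression on all five dummies, lists the VCE, and prints the pooled error variance next to the group-specific sd values from -tabstat-. The ibn. factor-variable prefix is one way to request all five dummies.

```stata
sysuse auto, clear

* Regression of price on all five rep78 dummies, no constant
regress price ibn.rep78, noconstant

* Diagonal VCE: each entry is the pooled error variance divided by the group size
matrix list e(V)

* Pooled error variance (Root MSE squared) shared by all groups
display "pooled sigma^2 = " e(rmse)^2

* Group-specific sd, n and s.e.(mean) for comparison
tabstat price, by(rep78) stat(mean sd n semean)
```

In this setup the regression standard error for each group is e(rmse)/sqrt(n), while the semean column uses each group's own sd, so the two can differ substantially when the group variances are unequal.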

Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Schlueterstr. 28
D-20146 Hamburg

phone: +49-(0)40-42838.7498 (office)
       +49-(0)40-42838.4591 (Mrs Billon)
fax:   +49-(0)40-42838.2344