 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Ttest and Welch's degrees of freedom

 From Roger Newson To "statalist@hsphsun2.harvard.edu" Subject Re: st: Ttest and Welch's degrees of freedom Date Mon, 18 Apr 2011 13:49:22 +0100

```Sorry, the sentence:

In the case of an unequal-variance t-test, the parameter of
> interest is the difference between 2 sub-population means, and its
> sampling-variance variance estimator is the square root of the sum of
> the 2 squared standard errors of the 2 sample means.

should have been:

In the case of an unequal-variance t-test, the parameter of
> interest is the difference between 2 sub-population means, and its
> sampling-variance estimator is the sum of
> the 2 squared standard errors of the 2 sample means.

Sorry for any inconvenience caused.

Best wishes

Roger

Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:

Opinions expressed are those of the author, not of the institution.

On 18/04/2011 12:49, Roger Newson wrote:
```
```As far as I can see, there is no reason that the Welch degrees of
freedom (or even the Satterthwaite degrees of freedom) shouldn't be
greater than the homoskedastic (equal-variance) degrees of freedom,
which is (as Garry says) n1 + n2 - 2. Of course, this is not the case
most of the time, but (as Garry has shown) it is the case some of the time.

In statistical confidence interval formulas, the term "degrees of
freedom" is a shorthand for "twice the inverse-squared coefficient of
variation of the variance estimator itself", where the "variance
estimator" is the estimated sampling variance of the estimated
parameter. In the case of an unequal-variance t-test, the parameter of
interest is the difference between 2 sub-population means, and its
sampling-variance variance estimator is the square root of the sum of
the 2 squared standard errors of the 2 sample means. This
sampling-variance estimator is itself subject to sampling variation,
which is why we use the t-distribution instead of the Normal
distribution to calculate confidence limits. IF the 2 sub-population
variances are equal, THEN you can use the equal-variance standard error
for the difference between 2 means, which works by using the sample
variance of the larger sample to estimate the sub-population variance of
the smaller sample. And, IF the 2 sub-population variances are equal,
THEN, by definition, this is a reasonable thing to do. And, IF the 2
subpopulation variances are equal, THEN the equal-variance standard
error for the difference between the 2 means will be subject (at least
asymptotically) to less sampling-variation than the unequal-variance
standard error for the difference between the 2 means, and therefore
will be allowed to have more degrees of freedom. HOWEVER, IF the 2
sub-population variances are unequal, THEN the equal-variance standard
error of the difference between the 2 means will be biased anyway, and
may or may not be subject to less sampling variability than the
unequal-variance standard error of the difference between the 2 means.

I hope this helps.

Best wishes

Roger

On 18/04/2011 09:37, Garry Anderson wrote:
```
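The definition of "degrees of freedom" in the reply above can be written out as a worked formula. This is a sketch, assuming the Satterthwaite and Welch formulas as given in the Methods and formulas section of the Stata -ttest- manual entry. The symbols V, s_1, s_2, n_1, n_2 and R are notation introduced here, not taken from the thread.

```latex
% Degrees of freedom as twice the inverse squared coefficient of variation
% of the variance estimator V (exact when V is a scaled chi-squared variable)
\nu = \frac{2}{\mathrm{CV}(V)^2} = \frac{2\,\mathrm{E}[V]^2}{\mathrm{Var}(V)}

% Unequal-variance estimator of the sampling variance of the difference in means
V = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}

% Satterthwaite's approximation
\nu_S = \frac{V^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}

% Welch's approximation
\nu_W = -2 + \frac{V^2}{\dfrac{(s_1^2/n_1)^2}{n_1 + 1} + \dfrac{(s_2^2/n_2)^2}{n_2 + 1}}

% Special case n_1 = n_2 = n, with R = V^2 / \{(s_1^2/n)^2 + (s_2^2/n)^2\}, 1 \le R \le 2
\nu_S = (n - 1)\,R, \qquad \nu_W = (n + 1)\,R - 2
```

With equal sample sizes, R approaches 2 as the two sample variances approach each other, so Welch's formula approaches 2n, which exceeds the equal-variance value 2n - 2, while Satterthwaite's formula approaches 2n - 2.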
```Dear Statalist,

I was reading the -ttest- entry in the manual on page 1998 (example 3)
and noticed that use of Welch's degrees of freedom can increase the
degrees of freedom compared with the usual degrees of freedom obtained
from an unpaired t-test.

Should Welch's degrees of freedom be larger than n1 + n2 - 2 ?

The commands and output are shown below.

. use http://www.stata-press.com/data/r11/fuel3

. ttest mpg, by(treated)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |      12          21    .7881701    2.730301    19.26525    22.73475
       1 |      12       22.75    .9384465    3.250874    20.68449    24.81551
---------+--------------------------------------------------------------------
combined |      24      21.875    .6264476    3.068954    20.57909    23.17091
---------+--------------------------------------------------------------------
    diff |               -1.75    1.225518               -4.291568    .7915684
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -1.4280
Ho: diff = 0                                     degrees of freedom =       22

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0837         Pr(|T| > |t|) = 0.1673          Pr(T > t) = 0.9163

. ttest mpg, by(treated) welch

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |      12          21    .7881701    2.730301    19.26525    22.73475
       1 |      12       22.75    .9384465    3.250874    20.68449    24.81551
---------+--------------------------------------------------------------------
combined |      24      21.875    .6264476    3.068954    20.57909    23.17091
---------+--------------------------------------------------------------------
    diff |               -1.75    1.225518                -4.28369    .7836902
------------------------------------------------------------------------------
    diff = mean(0) - mean(1)                                      t =  -1.4280
Ho: diff = 0                             Welch's degrees of freedom =  23.2465

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0833         Pr(|T| > |t|) = 0.1666          Pr(T > t) = 0.9167

.

Kind regards, Garry

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
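The Welch figure in the output above can be checked by hand from the reported group sizes and standard deviations. The sketch below is in Python rather than Stata and assumes the Satterthwaite and Welch formulas as documented in the Stata -ttest- manual entry; the function names `satterthwaite_df` and `welch_df` are hypothetical helpers written for this illustration.

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Satterthwaite's approximate degrees of freedom (assumed formula)."""
    a = s1 ** 2 / n1  # squared standard error of the first sample mean
    b = s2 ** 2 / n2  # squared standard error of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def welch_df(s1, n1, s2, n2):
    """Welch's approximate degrees of freedom (assumed formula)."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return -2 + (a + b) ** 2 / (a ** 2 / (n1 + 1) + b ** 2 / (n2 + 1))


# Summary statistics copied from the -ttest- output for the fuel3 data
sd0, n0 = 2.730301, 12  # group 0 (untreated)
sd1, n1 = 3.250874, 12  # group 1 (treated)

pooled_df = n0 + n1 - 2  # equal-variance degrees of freedom
df_s = satterthwaite_df(sd0, n0, sd1, n1)
df_w = welch_df(sd0, n0, sd1, n1)

print(f"equal-variance df = {pooled_df}")
print(f"Satterthwaite df  = {df_s:.4f}")
print(f"Welch df          = {df_w:.4f}")
```

Under these assumptions the Welch value agrees with the 23.2465 reported by -ttest, welch-, which is above n1 + n2 - 2 = 22, while the Satterthwaite value for the same data comes out at about 21.36, below 22.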