Denominator degrees of freedom for mixed models were introduced in Stata 14.

See the latest version of denominator degrees of freedom for mixed models. See all of Stata's survey methods features.

See the new features in Stata 19.

Denominator degrees of freedom for mixed models

Highlights

Hypothesis tests and confidence intervals using t and F distributions
Five denominator-degrees-of-freedom (DDF) adjustments
- Kenward-Roger
- Satterthwaite
- ANOVA
- Repeated-measures ANOVA
- Residual
Small-sample inference for linear combinations
Small-sample inference for linear hypothesis tests
Small-sample inference for contrasts

What's this about?

Stata fits linear mixed-effects models and, until now, provided only large-sample inference based on normal and chi-squared distributions.

In small samples, the sampling distributions of test statistics are known to be t and F in simple cases, and those distributions can be good approximations in other cases. Stata 14 provides five methods for small-sample inference, also known as denominator-degrees-of-freedom (DDF) adjustments, including Satterthwaite and Kenward-Roger. In addition to adjusting the confidence intervals and significance tests reported by Stata's mixed estimation command, small-sample statistics are also provided for subsequent estimation of linear combinations and linear hypothesis tests of fixed effects.
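As a rough sketch of how these adjustments might be requested, the commands below pass a method keyword to the dfmethod() option of mixed. The variable names y, time, and id are borrowed from the example later on this page. The keywords and the estat df call reflect our reading of the [ME] mixed and [ME] estat df entries and should be checked against the installed version.

```stata
* Sketch only: assumes a dataset with outcome y, covariate time, and
* grouping variable id is already in memory.

* Kenward-Roger and Satterthwaite (documented as requiring REML estimation)
mixed y time || id: time, reml covariance(unstructured) dfmethod(kroger)
mixed y time || id: time, reml covariance(unstructured) dfmethod(satterthwaite)

* ANOVA and residual methods
mixed y time || id: time, reml covariance(unstructured) dfmethod(anova)
mixed y time || id: time, reml covariance(unstructured) dfmethod(residual)

* dfmethod(repeated) is left out on purpose: it is documented for balanced
* repeated-measures designs and may not apply to a random-coefficient model.

* Alternatively, fit once and compare several methods afterward
mixed y time || id: time, reml covariance(unstructured)
estat df, method(kroger satterthwaite anova residual)
```

Fitting once and calling estat df avoids refitting the model for each method, which is convenient when the goal is only to see how much the degrees of freedom differ.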

Let's see it work

Consider a simple random-coefficient model for longitudinal data from Kenward and Roger (1997). There are 24 subjects, identified by the variable id. The subjects can be measured at any of nine time periods, but the outcome y is recorded at only three time periods for each subject, meaning that the subjects are not all seen at the same times.

To study both fixed and random effects of time, we fit the following mixed model using restricted maximum likelihood (REML) with an unstructured covariance between the random effects:

. mixed y time || id: time, reml covariance(unstructured)

Performing EM optimization:

Performing gradient-based optimization:


Iteration 0:    log likelihood = -109.44372
Iteration 1:    log likelihood = -109.39161
Iteration 2:    log likelihood = -109.39153
Iteration 3:    log likelihood = -109.39153


Computing standard errors:

Mixed-effects REML regression                   Number of obs     =         72
Group variable: id                              Number of groups  =         24

                                                Obs per group:
                                                              min =          3
                                                              avg =        3.0
                                                              max =          3

                                                Wald chi2(1)      =       4.34
Log restricted-likelihood = -109.39153          Prob > chi2       =     0.0372


           y       Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

        time    .2765987   .1327319     2.08   0.037     .0164489    .5367485
       _cons    1.045034   .2504823     4.17   0.000     .5540973     1.53597


  Random-effects Parameters     Estimate   Std. Err.     [95% Conf. Interval]

id: Unstructured             
                   var(time)    .3259698   .1356851      .1441665     .737039
                  var(_cons)    .4172514   .3432177      .0832198    2.092036
             cov(time,_cons)   -.1491218   .1736941      -.489556    .1913124

               var(Residual)    .3407946   .0844243      .2097135    .5538077

LR test vs. linear model: chi2(3) = 84.07                 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

Our default large-sample inference for time suggests that the fixed time effect is significant at the 5% level (p-value = 0.037). Empirical evidence suggests, however, that in small samples, the normal and chi-squared distributions may provide poor approximations to the unknown distributions of the test statistics and may lead to anticonservative results.
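The z statistic, p-value, and confidence interval reported for time follow from the coefficient and its standard error under the normal approximation. A minimal sketch of that arithmetic is given below; the two inputs are copied from the output, and the scalar names (b, se, z, p, lb, ub) are chosen here purely for illustration.

```stata
* Inputs copied from the large-sample output for time
scalar b  = .2765987
scalar se = .1327319

* Wald z statistic and two-sided p-value from the standard normal
scalar z = b/se
scalar p = 2*normal(-abs(z))

* 95% confidence limits use the normal critical value (about 1.96)
scalar lb = b - invnormal(.975)*se
scalar ub = b + invnormal(.975)*se

display "z = " %5.2f z "   z^2 = " %5.2f z^2 "   p = " %6.4f p
display "95% CI: " %9.7f lb "  " %9.7f ub
```

Because there is a single fixed-effect slope, z squared should agree with the Wald chi2(1) statistic in the header, and all displayed values should match the table up to rounding of the inputs.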

In Stata 14, we can account for small samples by specifying one of the five DDF methods. We use the Kenward-Roger method in this example.

. mixed y time || id: time, reml covariance(unstructured) dfmethod(kroger)

Performing EM optimization:

Performing gradient-based optimization:


Iteration 0:    log likelihood = -109.44372
Iteration 1:    log likelihood = -109.39161
Iteration 2:    log likelihood = -109.39153
Iteration 3:    log likelihood = -109.39153


Computing standard errors:

Computing degrees of freedom:

Mixed-effects REML regression                   Number of obs     =         72
Group variable: id                              Number of groups  =         24

                                                Obs per group:
                                                              min =          3
                                                              avg =        3.0
                                                              max =          3
DF method: Kenward-Roger                        DF:           min =      11.68
                                                              avg =      17.19
                                                              max =      22.69

                                                F(1,    22.69)    =       4.24
Log restricted-likelihood = -109.39153          Prob > F          =     0.0512


           y       Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

        time    .2765987     .13434     2.06   0.051    -.0015158    .5547132
       _cons    1.045034   .2700712     3.87   0.002     .4548251    1.635242


  Random-effects Parameters     Estimate   Std. Err.     [95% Conf. Interval]

id: Unstructured             
                   var(time)    .3259698   .1356851      .1441665     .737039
                  var(_cons)    .4172514   .3432177      .0832198    2.092036
             cov(time,_cons)   -.1491218   .1736941      -.489556    .1913124

               var(Residual)    .3407946   .0844243      .2097135    .5538077

LR test vs. linear model: chi2(3) = 84.07                 Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

After adjusting for a small sample, we do not have sufficient evidence to reject the null hypothesis of no time effect, at least at a 5% significance level.
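Two things changed for time relative to the large-sample table: the standard error, which Kenward-Roger adjusts, and the reference distribution, which is now t rather than standard normal. The sketch below redoes the arithmetic under the assumption that the degrees of freedom for time are 22.69, the denominator reported for the model F test (which here tests the single coefficient on time). The scalar names are illustrative.

```stata
* Inputs copied from the Kenward-Roger output for time
scalar b  = .2765987
scalar se = .13434
scalar df = 22.69                  // assumed: denominator df from F(1, 22.69)

* t statistic; its square should match the reported F statistic
scalar t = b/se

* Two-sided p-values: t reference versus normal reference, same statistic
scalar p_t = 2*ttail(df, abs(t))
scalar p_z = 2*normal(-abs(t))

* 95% confidence limits use the t critical value instead of 1.96
scalar lb = b - invttail(df, .025)*se
scalar ub = b + invttail(df, .025)*se

display "t = " %5.2f t "   t^2 = " %5.2f t^2
display "p (t, df) = " %6.4f p_t "   p (normal) = " %6.4f p_z
display "95% CI: " %9.7f lb "  " %9.7f ub
```

Comparing p_t with p_z isolates how much of the change in the conclusion comes from the heavier tails of the t distribution, as opposed to the slightly larger standard error.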

Our follow-up analyses can also account for small samples, for example, when computing linear combinations,

. lincom _b[_cons] + _b[time], small

 ( 1)  [y]time + [y]_cons = 0


           y       Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]

         (1)    1.321632   .2292508     5.77   0.000     .8235855    1.819679

and when performing linear hypothesis tests,

. test (_b[_cons]=1) (_b[time]==0), small

 ( 1)  [y]_cons = 1
 ( 2)  [y]time = 0

       F(  2, 15.60) =    3.05
            Prob > F =    0.0764
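The reported p-value can be recovered from the F statistic and its degrees of freedom with Ftail(). The second calculation below is an illustration only: it refers the same statistic, scaled by its numerator degrees of freedom, to a chi-squared distribution. That is not the large-sample test Stata would report, which would be built from the unadjusted variance estimates, but it gives a feel for the effect of the finite denominator degrees of freedom. Scalar names are illustrative.

```stata
* Inputs copied from the test output above
scalar Fstat = 3.05
scalar df1   = 2
scalar df2   = 15.60

* Small-sample p-value from the F distribution
scalar p_F = Ftail(df1, df2, Fstat)

* Hypothetical comparison: same statistic against chi-squared(df1)
scalar p_chi2 = chi2tail(df1, df1*Fstat)

display "Prob > F          = " %6.4f p_F
display "chi2 analogue (p) = " %6.4f p_chi2
```

The first value should agree with the reported Prob > F up to rounding of the F statistic.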

Reference

Kenward, M.G., and J.H. Roger. 1997. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics 53: 983-997.

Tell me more

Read more about small-sample adjustments in the Stata Multilevel Mixed-Effects Reference Manual; see [ME] mixed.

Read the overview from the Stata News.



