Home  /  Products  /  Stata 18  /  Multilevel meta-analysis

<- See Stata 18's new features

Highlights

• Multilevel meta-analysis and meta-regression

• Multiple levels of hierarchy

• Random intercepts and slopes

• Random-effects covariance structures

• Sensitivity analysis

• REML and ML estimation methods

• Multilevel Q statistic and test

• Heterogeneity

• Cochran's multilevel $$I^2$$ statistic

• Higgins–Thompson multilevel $$I^2$$ statistics

• Postestimation

• Prediction of random effects

• Comparative and diagnostic standard errors

• Variance–covariance matrix of random effects

• Residuals

• Standardized residuals

You want to analyze results from multiple studies, in which the reported effect sizes are nested within higher-level groupings such as regions or schools. Stata 18 adds two new commands, meta meregress and meta multilevel, to the meta suite to perform multilevel meta-analysis and meta-regression. Include random intercepts and coefficients at different levels of hierarchy, and assume different random-effects covariance structures, including exchangeable and unstructured. Perform sensitivity analysis by placing various constraints on variance components. Assess heterogeneity. Predict random effects and their comparative and diagnostic standard errors. And more.

Multilevel meta-analysis is a powerful statistical tool to synthesize effect sizes with a hierarchical structure, such as in a meta-analysis exploring the impact of a new teaching technique on math testing scores in different school districts. Here the effect sizes are nested within schools that are themselves nested within districts. Multilevel meta-analysis allows us not only to determine the overall effect of the technique but also to assess the variability among the effect sizes at different levels of the hierarchy. This is important because studies within the same district are likely to be similar and thus potentially dependent, and ignoring this dependence may lead to inaccurate results. By properly accounting for the dependence among the effect sizes, we can produce more accurate inference and gain a better understanding of the impact of the teaching technique.

#### Let's see it work

-> Example dataset: Modified school calendar data

-> Multilevel meta-analysis: Constant-only model

-> Multilevel heterogeneity

-> Multilevel meta-regression and random slopes: Incorporating moderators

-> Random-effects covariance structures

-> Predicting random effects

##### Example dataset: Modified school calendar data

Many studies suggest that the long summer break at the end of a school year is linked to a learning gap between students because of students' differential access to learning opportunities in the summer.

Cooper, Valentine, and Melson (2003) conducted a multilevel meta-analysis on schools that modified their calendars without prolonging the school year. The dataset consists of 56 studies that were conducted in 11 school districts. Some schools adopted modified calendars that featured shorter breaks more frequently throughout the year (for example, 12 weeks of school followed by 4 weeks off) as opposed to the traditional calendar with a longer summer break and shorter winter and spring breaks. The studies compared the academic achievement of students on a traditional calendar with those on a modified calendar. The effect size (stdmdiff) is the standardized mean difference, with positive values indicating higher achievement, on average, in the group on the modified calendar. The standard error (se) of stdmdiff was also reported by each study. Let's first describe our dataset:

. webuse schoolcal
(Effect of modified school calendar on student achievement)

. describe

Contains data from https://www.stata-press.com/data/r18/schoolcal.dta
Observations:            56                  Effect of modified school calendar on student
achievement
Variables:             8                  19 Jan 2023 21:44
(_dta has notes)

Variable      Storage   Display    Value
name         type    format    label      Variable label

district        int     %12.0g                District ID
school          byte    %9.0g                 School ID
study           byte    %12.0g                Study ID
stdmdiff        double  %10.0g                Standardized difference in means of achievement
test scores

var             double  %10.0g                Within-study variance of stdmdiff
year            int     %12.0g                Year of the study
se              double  %10.0g                Within-study standard-error of stdmdiff
year_c          byte    %9.0g                 Year of the study centered around 1990

Sorted by: district
##### Multilevel meta-analysis: Constant-only model

Because the schools are nested within districts, we fit a three-level random-intercepts model. This requires that we specify two random-effects equations: one for level 3 (identified by variable district) and one for level 2 (identified by variable school).

. meta meregress stdmdiff || district: || school: , essevariable(se)

Performing EM optimization ...

Iteration 0:  Log restricted-likelihood =  -104.8525  (not concave)
Iteration 1:  Log restricted-likelihood = -46.670529  (not concave)
Iteration 2:  Log restricted-likelihood = -22.871266  (not concave)
Iteration 3:  Log restricted-likelihood = -12.977299
Iteration 4:  Log restricted-likelihood = -7.9642885
Iteration 5:  Log restricted-likelihood = -7.9587271
Iteration 6:  Log restricted-likelihood = -7.9587239
Iteration 7:  Log restricted-likelihood = -7.9587239

Computing standard errors ...

Multilevel REML meta-analysis                               Number of obs = 56

Grouping information

No. of       Observations per group
Group variable      groups    Minimum    Average    Maximum

district           11          3        5.1         11
school           56          1        1.0          1

Wald chi2(0)  =  .
Log restricted-likelihood = -7.9587239                      Prob > chi2   =  .

stdmdiff   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

_cons     .1847132   .0845559     2.18   0.029     .0189866    .3504397

Test of homogeneity: Q_M = chi2(55) = 578.86               Prob > Q_M = 0.0000

Random-effects parameters      Estimate

district: Identity
sd(_cons)     .2550724

school: Identity
sd(_cons)     .1809324



The first table displays information on groups at different levels of hierarchy with one row for each grouping (level of hierarchy).

The second table displays the fixed-effects coefficients. Here there is only an intercept corresponding to the overall effect size $$\hat{\theta}$$. The value of $$\theta$$ is 0.185 with a 95% CI of [0.019, 0.35]. This means that, on average, students following the modified school calendar achieved higher scores than those who did not.

The third table displays the random-effects parameters, traditionally known as variance components in the context of multilevel or mixed-effects models. The variance-component estimates are now organized and labeled according to each level. By default, meta meregress reports standard deviations of the random intercepts (and correlations if they existed in the model) at each level. But you can instead specify the variance option to report variances (and covariances if they existed in the model). We have $$\hat{\tau_3} = 0.255$$ and $$\hat{\tau_2} = 0.181$$. These values are the building blocks for assessing heterogeneity across different hierarchical levels and are typically interpreted in that context. In general, the higher the value of $$\tau_l$$, the more heterogeneity is expected among the groups within level $$l$$.

Alternatively, this can be specified using the meta multilevel command as follows:

. meta multilevel stdmdiff, relevels(district school) essevariable(se)
(output omitted)


The meta multilevel command was designed to fit random-intercepts meta-regression models, which are commonly used in practice. It is a convenience wrapper for meta meregress.

##### Multilevel heterogeneity

We will use the postestimation command estat heterogeneity to quantify the multilevel heterogeneity among the effect sizes.

. estat heterogeneity

Method: Cochran
Joint:
I2 (%) = 90.50

Method: Higgins–Thompson
district:
I2 (%) = 63.32

school:
I2 (%) = 31.86

Total:
I2 (%) = 95.19


Cochran’s $$I^2$$ quantifies the amount of heterogeneity jointly for all levels of hierarchy. $$I^2 = 90.50\%$$ means that 90.50% of the variability among the effect sizes is due to true heterogeneity in our data as opposed to the sampling variability.

The multilevel Higgins–Thompson $$I^2$$ statistics assess the contribution of each level of hierarchy to the total heterogeneity, in addition to their joint contribution. For example, between-schools heterogeneity or heterogeneity within districts (level 2 heterogeneity) is the lowest, accounting for about 32% of the total variation in our data, whereas between-districts heterogeneity (level 3 heterogeneity) accounts for about 63% of the total variation.

##### Multilevel meta-regression and random slopes: Incorporating moderators

We will use variable year_c to conduct a three-level meta-regression and include random slopes (corresponding to variable year_c) at the district level.

. meta meregress stdmdiff year_c || district: year_c || school: , essevariable(se)

Performing EM optimization ...

Iteration 0:  Log restricted-likelihood = -101.95646  (not concave)
Iteration 1:  Log restricted-likelihood = -94.528133  (not concave)
Iteration 2:  Log restricted-likelihood = -29.169697  (not concave)
Iteration 3:  Log restricted-likelihood =  -10.67081  (not concave)
Iteration 4:  Log restricted-likelihood = -7.5089434  (not concave)
Iteration 5:  Log restricted-likelihood = -7.2219899
Iteration 6:  Log restricted-likelihood = -7.2085474  (not concave)
Iteration 7:  Log restricted-likelihood = -7.2082538  (not concave)
Iteration 8:  Log restricted-likelihood = -7.2079523  (not concave)
Iteration 9:  Log restricted-likelihood = -7.2073687  (not concave)
Iteration 10: Log restricted-likelihood = -7.2067537  (not concave)
Iteration 11: Log restricted-likelihood = -7.1989783
Iteration 12: Log restricted-likelihood = -7.1891619
Iteration 13: Log restricted-likelihood = -7.1815206
Iteration 14: Log restricted-likelihood = -7.1813888
Iteration 15: Log restricted-likelihood = -7.1813887

Computing standard errors ...

Multilevel REML meta-regression                         Number of obs =     56

Grouping information

No. of       Observations per group
Group variable      groups    Minimum    Average    Maximum

district           11          3        5.1         11
school           56          1        1.0          1

Wald chi2(1)  =   0.31
Log restricted-likelihood = -7.1813887                  Prob > chi2   = 0.5753

stdmdiff   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

year_c     .0059598   .0106378     0.56   0.575    -.0148899    .0268094
_cons     .1805809   .0904865     2.00   0.046     .0032306    .3579311

Test of homogeneity: Q_M = chi2(54) = 550.26               Prob > Q_M = 0.0000

Random-effects parameters      Estimate

district: Independent
sd(year_c)     .0177247
sd(_cons)      .219239

school: Identity
sd(_cons)     .1807703



The estimate of the regression coefficient of variable year_c is 0.006 with a 95% CI of [–0.015, 0.027] . We do not see any evidence for the association between stdmdiff and year_c (p = 0.575).

##### Random-effects covariance structures

Although year_c did not explain the heterogeneity, we continue to include it as a moderator for illustration purposes.

By default, the random slopes and random intercepts (at the district level) are assumed independent. Alternatively, we can specify an exchangeable covariance structure using option covariance(exchangeable). We suppress the header and the iteration log and display results with three decimal points using the noheader, nolog, and cformat(%9.3f) options.

. meta meregress stdmdiff year_c || district: year_c, covariance(exchangeable)
|| school:, essevariable(se) noheader nolog cformat(%9.3f)

stdmdiff   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

year_c        0.010      0.012     0.80   0.426       -0.014       0.033
_cons        0.153      0.074     2.06   0.039        0.008       0.298

Test of homogeneity: Q_M = chi2(54) = 550.26               Prob > Q_M = 0.0000

Random-effects parameters      Estimate

district: Exchangeable
sd(year_c _cons)        0.032
corr(year_c,_cons)        1.000

school: Identity
sd(_cons)        0.181



Alternatively, we can specify a custom covariance structure by fixing the correlation between the intercepts and slopes at 0.5 and allowing for their standard deviations to be estimated from the data:

. matrix A = (.,.5 \ .5,.)
. meta meregress stdmdiff year_c || district: year_c, covariance(custom A)
|| school:, essevariable(se) noheader nolog cformat(%9.3f)

stdmdiff   Coefficient  Std. err.      z    P>|z|     [95% conf. interval]

year_c        0.007      0.011     0.67   0.500       -0.014       0.028
_cons        0.170      0.082     2.08   0.038        0.010       0.330

Test of homogeneity: Q_M = chi2(54) = 550.26               Prob > Q_M = 0.0000

Random-effects parameters      Estimate

district: Custom
sd(year_c)        0.026
sd(_cons)        0.116
corr(year_c,_cons)        0.500*

school: Identity
sd(_cons)        0.180

(*) fixed during estimation
##### Predicting random effects

Below, we predict the random effects using predict, reffects and obtain their diagnostic standard errors by specifying the reses(, diagnostic) option.

. quietly meta meregress stdmdiff || district: || school: , essevariable(se)
. predict double u3 u2, reffects reses(se_u3 se_u2, diagnostic)
. by district, sort: generate tolist = (_n==1)
. list district u3 se_u3 if tolist

district           u3       se_u3

1.        11   -.18998595   .07071817
5.        12   -.08467077   .13168501
9.        18     .1407273   .11790486
12.        27    .24064814   .13641505
16.        56    -.1072942   .13633364

20.        58   -.23650899   .15003184
31.        71     .5342778   .12606073
34.        86    -.2004695    .1499012
42.        91    .05711692   .14284823
48.       108   -.14168396   .13094894

53.       644   -.01215679   .10054689



The random intercepts u3 are district-specific deviations from the overall mean effect size. For example, for district 18, the predicted standardized mean difference is 0.14 higher than the overall effect size.

#### Reference

Cooper, H., Valentine, J. C., and Melson, A. 2003. The effects of modified school calendars on student achievement and on school and community attitudes. Review of Educational Research 73: 1–52.

#### Tell me more  