The Stata News, Vol 34 No 5

In the spotlight: Customized forest plots for displaying meta-analysis results

Suppose you have a set of studies addressing whether coffee is good or bad for you. Some studies state that it is good, some state that it is bad, and others state that it has no effect on your health. Meta-analysis synthesizes results from the individual studies to help you figure out which of these statements is true, assuming one of them is. If none is, meta-analysis instead focuses on investigating the reasons behind the discrepancies in the reported effects; perhaps the benefits of coffee depend on the country of origin and storage time.

Stata 16's new meta suite can handle all the steps required to perform a meta-analysis. Here I will give you a brief preview of the meta suite, but my main goal is to demonstrate how you can use some new features from the recent update that allow you to customize your forest plots in various ways and highlight results of interest in your meta-analysis.

To demonstrate, I use a dataset from Colditz et al. (1994) of clinical trials that study the efficacy of a Bacillus Calmette-Guérin (BCG) vaccine in the prevention of tuberculosis (TB).

. use bcg

The dataset includes variables that record, for each study, the number of TB positive cases in the treated group (npost), the number of TB negative cases in the treated group (nnegt), the number of TB positive cases in the control group (nposc), and the number of TB negative cases in the control group (nnegc). The effects of interest are log risk-ratios. To compute log risk-ratios and declare them as the effect size of choice for the meta-analysis, I type

. meta esize npost nnegt nposc nnegc, esize(lnrratio) studylabel(studylbl)

I also specify that the variable studylbl, which contains the authors' names and year of publication, be used to label the studies.
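
For intuition, the effect sizes that meta esize computes can be reproduced by hand from the four count variables. The following is a minimal sketch, assuming the bcg dataset is in memory, that meta esize has been run as above, and that no study has a zero cell (otherwise Stata's zero-cell adjustment would make the values differ). The variable names lnrr_check and se_check are hypothetical and introduced only for this comparison; _meta_es and _meta_se are the system variables that hold the effect sizes and their standard errors.

```stata
* Sketch only: recompute the log risk-ratio and its standard error by hand.
* lnrr_check and se_check are hypothetical names, not created by meta esize.

* Risk in each group = positive cases / group size; effect = log of the ratio.
generate double lnrr_check = ///
    ln( (npost/(npost + nnegt)) / (nposc/(nposc + nnegc)) )

* Large-sample standard error of the log risk-ratio.
generate double se_check = ///
    sqrt( 1/npost - 1/(npost + nnegt) + 1/nposc - 1/(nposc + nnegc) )

* Compare the hand-computed values with those stored by meta esize.
list studylbl _meta_es lnrr_check _meta_se se_check, noobs
```

If the assumptions hold, the paired columns should agree up to rounding; negative values indicate a lower risk of TB in the vaccinated group.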

I can now display the results of the meta-analysis using a forest plot.

. meta forestplot

In the graph, each study corresponds to a navy square centered at the point estimate of the effect size, with a horizontal line (whiskers) extending on either side of the square, representing the 95% confidence interval of the point estimate. The area of the square is proportional to the corresponding study weight. Alternatively, I could have used meta summarize to display results numerically in a table.

In the forest plot above, we can scan vertically from the center of a box down to the overall effect size diamond or to the 0 value representing no effect to determine how an individual study compares with these values.

Below, I will draw vertical reference lines, using options esrefline and nullrefline, to highlight the study-specific deviations from the overall effect-size and no-effect values, respectively. I also use the insidemarker option to insert a marker (orange circle by default) at the center of each study marker (navy-blue squares) to indicate the study-specific effect sizes. This marker is particularly useful for more precise studies, which receive larger weights and therefore larger squares, because the exact center of a large square is harder to judge by eye.

. meta forestplot, esrefline nullrefline insidemarker

While the forest plot above allows us to easily spot differences from the null value of 0, the interpretation of the differences is not clear. To solve this, we can annotate which side of the plot corresponds to effect sizes that favor the treatment versus those that favor the control. Below, I specify the favorsleft() and favorsright() suboptions of the nullrefline() option to annotate sides of the plot with respect to the no-effect line.

. meta forestplot, nullrefline(favorsleft("Favors vaccine", color(green)) 
      favorsright("Favors control"))

We can now easily see that the overall effect size and the effect size estimated in most of the individual studies favor the treatment.

In the forest plot above, the width of the diamond represents a confidence interval for the overall effect size. I would also like to show the corresponding prediction interval. When I specify option predinterval, whiskers spanning the width of the prediction interval extend from the overall effect-size diamond.

. meta forestplot, predinterval

The prediction interval, represented by the green whiskers extending from the overall diamond, provides a plausible range for the effect size in a future, new study.
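
As a rough check on what these whiskers represent, the prediction interval can be approximated from the stored results of meta summarize. This sketch rests on two assumptions that should be checked against [META] meta summarize: that the interval follows the commonly used formula theta ± t(K-2) × sqrt(SE^2 + tau2), where K is the number of studies, and that meta summarize returns r(N), r(theta), r(se), and r(tau2). The scalar names are hypothetical.

```stata
* Sketch only: approximate 95% prediction interval from stored results.
* Assumes meta summarize returns r(N), r(theta), r(se), and r(tau2).
quietly meta summarize, nostudies

* Half-width: t critical value with K-2 degrees of freedom times the
* square root of (squared standard error of theta + between-study variance).
scalar pi_half  = invttail(r(N) - 2, 0.025) * sqrt(r(se)^2 + r(tau2))
scalar pi_lower = r(theta) - pi_half
scalar pi_upper = r(theta) + pi_half

display "Approximate 95% prediction interval: [" ///
    %6.3f pi_lower ", " %6.3f pi_upper "]"
```

Because tau2 enters the half-width, the prediction interval is wider than the confidence interval whenever the estimated between-study variance is positive. Specifying predinterval with meta summarize should report the interval numerically for comparison.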

How else might I customize my forest plot? What if I believe that effect size is related to the location of the study? To more easily evaluate this, I can add a column reporting the latitude. I will also specify option sort(latitude) to sort our results based on the ascending order of study latitudes. In the specification below, I omit some columns (_data and _weight) from the default forest plot. I construct a forest plot showing only columns for the study labels, the plot, the effect sizes and their confidence intervals, and the variable latitude. Note that the columns appear in the forest plot in the order they were specified.

. meta forestplot _id _plot _esci latitude, 
        columnopts(latitude, title("Absolute latitude")) sort(latitude)

In general, studies with more negative effect sizes (more strongly favoring treatment) appear to have been performed in locations farther from the equator. We could use meta regress to investigate this relation more formally, but I will not demonstrate this here.

The final customization that I will demonstrate places multiple overall effect sizes on the same forest plot. For example, I might want to compare the estimates of overall effect size based on the random-effects DerSimonian–Laird model, the common-effect inverse-variance model, and the default random-effects REML model. I can do this via the customoverall() option. First, I obtain the numerical values of the overall effect size and its confidence interval based on the DerSimonian–Laird and common-effect models as follows:

. meta summarize, random(dlaird) nostudies

  Effect-size label:  Log Risk-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se
        Study label:  studylbl

Meta-analysis summary                               Number of studies =     13
Random-effects model                                Heterogeneity:
Method: DerSimonian-Laird                                       tau2 =  0.3088
                                                              I2 (%) =   92.12
                                                                  H2 =   12.69

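        theta: Overall Log Risk-Ratio

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       theta |  -.7141172   .1787421    -4.00   0.000    -1.064445   -.3637892
------------------------------------------------------------------------------
Test of homogeneity: Q = chi2(12) = 152.23                  Prob > Q = 0.0000

. meta summarize, common(iv) nostudies

  Effect-size label:  Log Risk-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se
        Study label:  studylbl

Meta-analysis summary                               Number of studies =     13
Common-effect model
Method: Inverse-variance

        theta: Overall Log Risk-Ratio

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       theta |  -.4302852   .0404988   -10.62   0.000    -.5096613   -.3509091
------------------------------------------------------------------------------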

Option nostudies was used to suppress results from individual studies because they are not relevant here. I then specify the values from meta summarize output in the customoverall() option.

. meta forestplot,                                                           
        customoverall(-0.714 -1.064 -0.364, label("{bf:Random-DL model}")) 
        customoverall(-0.430 -0.510 -0.351, label("{bf:Common-IV model}"))

The default REML-based estimates are similar to the DerSimonian–Laird estimates, while the effect size from the common-effect model is smaller in magnitude (closer to 0), with a tighter confidence interval.
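
Instead of retyping rounded values from the output, the estimates could be captured from stored results and passed to customoverall() through local macros. This is a sketch under the assumption that meta summarize returns the overall effect size and its confidence limits in r(theta), r(ci_lb), and r(ci_ub); confirm the names with return list after running meta summarize. The macro names are hypothetical.

```stata
* Sketch only: feed stored results into customoverall() instead of typing them.
* Assumes meta summarize returns r(theta), r(ci_lb), and r(ci_ub).

* DerSimonian-Laird random-effects estimates
quietly meta summarize, random(dlaird) nostudies
local dl_es = r(theta)
local dl_lb = r(ci_lb)
local dl_ub = r(ci_ub)

* Common-effect inverse-variance estimates
quietly meta summarize, common(iv) nostudies
local iv_es = r(theta)
local iv_lb = r(ci_lb)
local iv_ub = r(ci_ub)

* The forest plot itself still uses the declared default (REML) model.
meta forestplot, ///
    customoverall(`dl_es' `dl_lb' `dl_ub', label("{bf:Random-DL model}")) ///
    customoverall(`iv_es' `iv_lb' `iv_ub', label("{bf:Common-IV model}"))
```

This keeps the plotted diamonds in sync with the data if the dataset or the effect-size declaration changes. Local macros persist only within a single do-file run, so the lines should be executed together.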

Prefer to point and click instead of typing commands to create your forest plots? No worries. Everything I have shown above, as well as all of meta's features, can also be accessed using Stata's menus and dialog boxes. You may also use the Graph Editor to customize your forest plots.

For other examples of forest plots such as subgroup and cumulative forest plots, see [META] meta forestplot. Other meta commands that were not mentioned above are meta funnel, meta bias, and meta trimfill for investigating small-study effects. Also, bubble plots and L'Abbé plots may be constructed via commands estat bubbleplot and meta labbeplot, respectively. If you would like to learn more about meta-analysis in Stata, you can go here for examples and one possible workflow. You can also search the full list of features or look through the new Stata Meta-Analysis Reference Manual.

— Houssein Assaad
Senior Statistician and Software Developer


  • Colditz, G. A., T. F. Brewer, C. S. Berkey, M. E. Wilson, E. Burdick, H. V. Fineberg, and F. Mosteller. 1994. Efficacy of BCG vaccine in the prevention of tuberculosis: Meta-analysis of the published literature. Journal of the American Medical Association 271: 698-702.