Abstract
Discerning a possible dose-response pattern based on tables of empirical contrasts is difficult. Specification of a statistical model able to consider the possible dose-response data generating mechanism including its variation across studies is crucial for statistical inference. Aim of this presentation is to illustrate weighted mixed-effects dose-response models suitable for tables of correlated estimates. The command drmeta can be used with additive (mean difference) and multiplicative (odds ratios, hazard ratios) measures of association. The post-estimation command drmeta\_graph greatly facilitates the visualization of predicted average and study-specific dose-response relationships. Applications of the drmeta command are illustrated with regression splines in experimental and observational data based on non-linear and random-effects data generation mechanisms that can be encountered in health related sciences.
Data source: Web of Science
A weighted (linear) mixed-effects dose-response model (Stat Meth Med Res, 2019), also known as one-stage approach, can be specified as follows
$$ \begin{equation} \boldsymbol{\hat \gamma}_i = \mathbf{X}_{i} \boldsymbol{\beta} + \mathbf{Z}_{i} \mathbf{b}_i + \boldsymbol{\epsilon}_{i} \label{eq:dosresmodel} \end{equation} $$
$\boldsymbol{\hat \gamma}_i$ is the vector of empirical constrasts estimated in the $i$-th study
$\mathbf{X}_{i}$ is the design matrix for the fixed-effects $\boldsymbol{\beta}$
$\mathbf{Z}_{i}$ is the design matrix for the random-effects $\mathbf{b}_i$
$\mathbf{b}_i \sim \mathcal{N}\left(\mathbf{0}, \boldsymbol{\Psi} \right)$
The random-effects $\mathbf{b}_i$ represent study-specific deviations from the average dose-response coefficients $\boldsymbol{\beta}$.
The residual error term $\boldsymbol{\epsilon}_i \sim \mathcal{N}\left(\mathbf{0}, \mathbf{S}_i \right)$, whose variance matrix $\mathbf{S}_i$, the weight, is assumed known.
$\mathbf{S}_i$ can be either given or approximated using available data (Stata J, 2006; AJE, 2012, BMC Med Res Meth, 2016).
Interest in the constant change (linear trend) in average response associated with one unit increase in the dose
$\beta_1$ is the average linear trend
Interest in how the average response is smoothly varying according to the value of the dose
where $s_1$ and $s_2$ are restricted cubic spline functions (help mkspline) of the dose with three knots $\left(k_1, k_2, k_3\right)$, typically located at fixed percentiles of the dose distribution, the two splines are
$\beta_1$ is the average linear trend before the first knot $k_1$
$\beta_2$ is the departure from the average linear trend above the first knot $k_1$
Interest in the change of the linear trend beyond the dose value $k$
where the degree-1 linear spline $(x_{ij}>k)(x_{ij}-k)$, also written as $(x_{ij}-k)_{+}$, is activated when $x_{ij}>k$
$\beta_1$ is the average linear trend before the dose value $k$
$\beta_2$ is the change in average linear trend after the dose value $k$
$(\beta_1 + \beta_2)$ is the average linear trend after the dose value $k$
Orsini, N., and Spiegelman D. Meta-Analysis of Dose-Response Relationships. Chapter 18. Handbook of Meta-Analysis. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. Editors CH Schmid, T. Stijnen, and I. White. 2020. Pages 395-428.
Details can be found in a forthcoming article on Stata Journal, 2021.
Let $\boldsymbol{x}^{*}$ be the design matrix obtained by applying the chosen dose-response function to a number of plausible values of the dose, and let $\boldsymbol{x}^{*}_{0}$ be the same design matrix evaluated at the chosen reference level.
The motivation for using $\boldsymbol{x}^{*}$, rather than the observed $\mathbf{X}_{i}$ used to fit the model, is that $\mathbf{X}_{i}$ does not contain enough data points for a smooth plot of the either average or study-specific dose-response relationship. The predicted average dose-response curve is given by
A pointwise $100(1-\alpha)$\% confidence intervals obtained as follows
An advantage of the mixed-effects model is to allow estimation of study-specific dose-response relationships.
The best linear unbiased prediction (BLUP) of $\mathbf{b}$ can be computed as
The conditional study-specific dose-response relationships are obtained by adding the fixed-effects and BLUPs as follows
Overlaying the average $\hat {\boldsymbol{\gamma}}^*$ and study-specific $\hat {\boldsymbol{\gamma}}^{*}_{i}$ allows visual appreciation of the central tendency and spread of dose-response relationships across studies.
Predictions can be easily obtained and plotted with the drmeta postestimation command predict and drmeta_graph
Just type (once) at Stata command line
ssc install drmeta
Tables of correlated empirical contrasts arising from heterogenous and possibly non-linear relationships investigated in experimental and observational studies.
The drmeta_graph command greatly facilitates the visualization of the estimated dose-response model.
drmeta_graph plots the average dose-response relationship together with confidence intervals upon indication of a list of dose/exposure values, a referent, and the types of transformations used to model the quantitative predictor.
It is particularly convenient when modelling the dose with splines.
The drmeta_gof command provides tools (deviance test, R-squared) to evaluate the goodness-of-fit in dose-response meta-analysis.
Further details
Discacciati A, Crippa A, Orsini N. Goodness of fit tools for dose-response meta-analysis of binary outcomes. Research Synthesis Methods. 2017 Jun;8(2):149-160.
A Monte-Carlo simulation study consisted of the following steps:
Let's generate studies arising from a non-linear random-effects dose-response mechanism where the average shape is
so, the lowest mean outcome for the average study is expected to be at
use http://www.stats4life.se/data/md_drm, clear
list id dose md , sepby(id)
set scheme _GRSTYLE_
twoway ///
(scatter md dose if se == 0, mc(blue%30)) ///
(scatter md dose [aw=1/se^2], mc(red%30)) ///
, ytitle("Mean Difference") legend(off)
twoway ///
(function dose = chi2den(5, x) , range(0 20) recast(area) lc(green%10) fc(green%10) yaxis(2) yscale(axis(2) off)) ///
(scatter md dose if se == 0, mc(blue%30)) ///
(scatter md dose [aw=1/se^2], sort msize(vsmall) mc(red%30) ) ///
(pcspike md mindose md maxdose if md != 0, lc(red%30) ) ///
, ytitle("Mean Difference") ///
legend(off) plotregion(style(none)) ///
xtitle("Dose") yscale(alt)
drmeta md dose, data(n sd) id(id) type(type_md) ml
Interpretation
Every 1 unit increase of the dose is estimated to increase, on average, the mean outcome by 0.3 units (95% CI = -0.41, 1.04). The estimated variance of the random-effects is large, suggesting a strong variability of the study-specific linear trends. Data appears to be compatible with a flat dose-response association (z=0.85, p-value=0.394).
Although we cannot exclude that for some studies, the dose-response relationship can be actually linear, data may also be compatible with the hypothesis of higher mean outcomes at the dose extremes relative to the middle range of the dose (i.e. U/J-shaped). Such shape would be unlikely to be discovered forcing a linear dose-response function in all the studies.
Therefore, we now allow a curvilinear relationship to be detected using restricted cubic splines with 3 knots of the dose.
mkspline doses = dose, nk(3) cubic displayknots
mat knots = r(knots)
drmeta md doses1 doses2 , se(semd) data(n sd) id(id) type(type_md) ml
Interpretation
The maximized log-likelihood of a mixed-effects model using the restricted cubic spline function (−49) of the dose is, in absolute terms, about two-fifths of the maximized log-likelihood of the the linear function (−118).
Even considering the higher number of estimated parameters of the spline function (2 coefficients + 3 variance/covariance of random-effects) relative the linear function (1 coefficient + 1 variance of random-effects), the AIC of the spline function (AICs = 107) is about a half the one of the linear function (AICl = 240).
Assuming the average dose-response function is truly linear, a Wald-type statistic being as or more extreme than observed would rarely occur ($H_0 : \beta_2 = 0$, $z$=5.24, $p$-value<0.001).
That said, we still have no idea of the possible shape relating the dose to the mean outcome.
drmeta_graph , matk(knots) dose(2(.5)10) ref(5) name(figure2, replace) ///
ytitle("Mean Difference") list yline(0, lp(dot) lc(black)) xlabel(2(1)10) ylabel(#7) ///
addplot(-2*(d-5)+.2*(d^2-25)) plotopts(lc(blue))
graph display
Figure Estimated summary (solid line) dose-response relationship with 95% confidence intervals (dashed lines) based on 10 tables of empirical estimates. Data were fitted with a weighted mixed-effects model using restricted cubic splines for the dose with three knots located at percentiles (10th, 50th, 90th) of the overall dose distribution. The blue line is the true summary dose-response relationship. The dose value of 5 units served as referent.
Let’s now consider tables of summarized data from 10 hypothetical observational prospective studies investigating the association between the quantitative dose (i.e. body mass index (BMI, $kg/m^2$) and the occurrence of a binary outcome (i.e. alive/death status during 10 years follow-up time).
Baseline age, a potential confounding variable, is associated with a higher mean BMI and is associated with higher mortality odds no matter what is the specific value taken by the BMI.
A mechanism of this type, commonly known as “confounding”, is likely to produce estimates of exposure effects apparently inconsistent.
Under a non-linear random-effect data generating mechanism , let's assume an age-adjusted odds ratio relating BMI to mortality of this form
(i.e. higher mortality at the extremes, particularly on the right tail, of the BMI distribution).
The lowest age-adjusted mortality risk can be expected to be at this level of BMI
use http://www.stats4life.se/data/or_drm, clear
list id bmi b seb case n , sepby(id)
drmeta b bmi, se(seb) data(n case) type(type) id(id) ml stddev
Interpretation
Every increment of 3 kg/m$^2$ in BMI is associated, on average, with a 48%, $e^{0.13(3)}$, higher age-adjusted mortality odds.
We next allowed a curvilinear relationship to be detected by using a restricted cubic spline function with three knots at fixed percentiles (10th, 50th, 90th) of the BMI distribution.
mkspline bmis = bmi, nk(3) cubic displayknots
mat knots = r(knots)
drmeta b bmis1 bmis2 , se(seb) data(n case) type(type) id(id) ml
drmeta_graph , matk(knots) dose(21(.2)28) list ref(24) ///
addplot(-2.3*(d-24)+0.05*(d^2-24^2)) ///
plotopts(lc(blue)) eform ///
ytitle("Adjusted Odds Ratio") ///
xtitle("Body Mass Index (kg/m{sup:2})") ///
name(fig4, replace) ylabel(1 1.2 1.5 2 3 4, angle(horiz))
graph display
Interpretation
The average dose-response relationship between BMI and age-adjusted mortality odds appear to be J-shaped with a possibile minimum at around 23.5-24 kg/m^2.
Let's now consider tables of summarized data from 30 hypothetical prospective cohort studies investigating the association between baseline brisk walking, measured in hours/week, and time until death, or end of follow-up (10 years), whichever came first.
Age is inversely associated with brisk walking and positively associated with higher mortality rates independently of brisk walking levels.
Furthermore, the true summary age-adjusted mortality hazard ratio is descreasing with higher walking levels with a threshold effect at 2 hours per week; that is
Principal investigators categorized brisk walking into 2/4 quantiles. Age-adjusted mortality hazard ratios comparing brisk walking categories were estimated using a Cox regression model.
Each set of empirical estimates arise from a random-effects data-generating mechanism.
use http://www.stats4life.se/data/hr_drm, clear
list, sepby(id)
// generate the degree-1 spline with a knot at 2
gen walkplus = (walk>2)*(walk-2)
drmeta b walk walkplus, se(seb) data(n case) type(type) id(id) ml
lincom walk*1/2, eform
lincom (walk+walkplus)*1/2, eform
Interpretation
Below 2 hours per week, every increment of 30 minutes per week in brisk walking is estimated to reduce the age-adjusted mortality rate by 21% (HR=$e^{-0.47(1/2)}=0.79$; 95\% CI = $e^{-.47(1/2)\pm 1.96(0.054)(1/2)} = 0.75-0.83$).
Above 2 hours per week, every increment of 30 minutes per week in brisk walking is estimated to increase, on average, the age-adjusted mortality rate by 4% (HR=$e^{(-0.47+0.54)(1/2)}=1.04$).
Statistical inference for a combination of parameters and data, such as $e^{(\beta_1+\beta_2)(1/2)}$, can be easily obtained with the lincom postestimation command.
drmeta_graph , eq(d (d>2)*(d-2) ) dose(0(.2)4) list ref(2) ///
addplot(-.5*(d-2)+.5*(d>2)*(d-2)) ///
plotopts(lc(black) lp(longdash)) ///
eform ///
ytitle("Adjusted Hazard Ratio") ///
xtitle("Brisk walking (hours/week)")
graph display
mat list e(blup20)
mat beta20 = e(b)+e(blup20)
mat list beta20
drmeta_graph , eq(d (d>2)*(d-2) ) dose(0(.2)4) list ref(2) gls noci ///
addplot(-.5*(d-2)+.5*(d>2)*(d-2)) ///
plotopts(lc(black) lp(longdash)) ///
ylabel(.5 1 2 4 8, angle(horiz)) eform ///
ytitle("Adjusted Hazard Ratio") ///
xtitle("Brisk walking (hours/week)")
graph display
Researchers, co-authors, and doctoral students who, over the years, contributed to the development of methods and procedures for dose-response meta-analysis.
Orsini, N. Weighted mixed-effects dose-response models for tables of correlated contrasts. Stata Journal. 2021. Forthcoming.
Orsini, N., and Spiegelman D. Meta-Analysis of Dose-Response Relationships. Chapter 18. Handbook of Meta-Analysis. Chapman and Hall/CRC, 2020. 395-428.
Crippa, A., Discacciati, A., Bottai, M., Spiegelman, D., Orsini, N (2019). One-stage dose–response meta-analysis for aggregated data. Statistical Methods in Medical Research, 28(5), 1579-1596.