Join us for the 2026 Stata Biostatistics and Epidemiology Virtual Symposium, a meeting of researchers in biostatistics and epidemiology from around the world discussing current theory and applied methods using Stata. The program consists of invited talks by top Stata users, and the virtual platform allows you to experience this one-day event from wherever you are.
Seats will be limited. Register now.
Enjoy insightful and informative presentations by these experienced Stata users in the field.
All times are in Central Standard Time.
10:00 a.m.
Comparison of estimation methods for net survival probabilities of breast cancer
Anna Johansson, Karolinska Institutet
In population-based cancer epidemiology, we often estimate net survival probabilities, that is, the probability of surviving the cancer when other causes of death are ignored. Net survival is a useful measure for assessing the effectiveness of cancer control interventions and for comparisons across periods and regions. Net survival is mainly estimated through two approaches: (i) cause-specific survival (using cause-of-death information) and (ii) relative survival (using expected population mortality). Both approaches rely on assumptions that may or may not be plausible for different cancers. I will describe some challenges with estimating cause-specific and relative survival of breast cancer at different ages, especially for subtypes with near 100% survival. I will compare non-parametric estimation with model-based estimation using flexible parametric survival models. We used the commands sts list (Kaplan-Meier method), stpp (Pohar-Perme method), and stpm3 (flexible parametric survival models) to estimate these net survival probabilities.
10:30 a.m.
Regression models for accuracy estimation
Niels Henrik Bruun, Aalborg University Hospital
I present regression-based methods for estimating and comparing diagnostic accuracy measures while addressing the STARD 2015 requirements. Key metrics include sensitivity, specificity, AUC, PPV, NPV, and accuracy. True-positive and false-positive rates, independent of prevalence, are estimated using OLS regression with robust variance. The derived measures, PPV, NPV, and accuracy, are computed from prevalence, sensitivity, and specificity using nonlinear formulas. For single-modality analysis, sensitivity and specificity are obtained by regressing test outcomes on the "true" values, such as those obtained from pathology. For multimodality studies on the same subjects, data are stacked with a modality indicator, and mixed-effects models with random intercepts are used to account for correlation. A new confreg command combines regression and nonlinear estimation to estimate accuracy metrics under dependency structures. These methods provide a flexible framework for robust comparisons of diagnostic performance across instruments.
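As an illustration of the nonlinear formulas mentioned in this abstract, the following minimal Python sketch derives PPV, NPV, and accuracy from prevalence, sensitivity, and specificity using the standard textbook definitions. It is not the confreg command and not the speaker's code. The function name derived_accuracy and the input values are hypothetical.

```python
def derived_accuracy(prevalence, sensitivity, specificity):
    """Derive PPV, NPV, and accuracy from prevalence, sensitivity, specificity.

    Hypothetical helper for illustration only. All inputs are proportions
    in [0, 1]; the four cell probabilities below sum to 1.
    """
    tp = sensitivity * prevalence                # true positives
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return {
        "ppv": tp / (tp + fp),    # P(disease | positive test)
        "npv": tn / (tn + fn),    # P(no disease | negative test)
        "accuracy": tp + tn,      # P(test agrees with the truth)
    }


# Hypothetical inputs: 10% prevalence, 90% sensitivity, 80% specificity.
result = derived_accuracy(prevalence=0.10, sensitivity=0.90, specificity=0.80)
print(" ".join(f"{name}={value:.3f}" for name, value in result.items()))
# prints: ppv=0.333 npv=0.986 accuracy=0.810
```

Even with high sensitivity and specificity, PPV is low when prevalence is low, which is why these derived measures depend on the study population.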
11:00 a.m.
Break
11:15 a.m.
wqsreg - a Stata command for weighted quantile sum regression
Marta Ponzano, Department of Life Sciences, Health and Health Professions, Link Campus University; Department of Health Sciences, University of Genoa
Additional authors:
Stefano Renzetti, Department of Health Sciences, University of Genoa
Andrea Bellavia, Department of Environmental Health, Harvard T.H. Chan School of Public Health; TIMI Study Group, Brigham and Women's Hospital, Harvard Medical School
Weighted quantile sum (WQS) regression is a statistical method for quantifying the association between a set of possibly correlated predictors and a health outcome, estimating the joint effect of the predictors as well as their individual contributions to the total effect. We present wqsreg, the first Stata command for WQS regression, implemented for continuous, binary, and count outcomes. The execution of the command involves two sequential steps: 1) estimating the weights and constructing the WQS index under specific constraints and 2) modeling its association with the outcome. wqsreg integrates several flexible components of the framework such as bootstrap, training/validation, and repeated holdout procedures; it returns regression estimates as well as graphical displays of the individual weights. wqsreg requires Stata version 11 or higher and is freely available on GitHub. We present an application of the command on exposome data exploring the association between 38 exposures and a continuous outcome while adjusting for a set of covariates. To the best of our knowledge, wqsreg provides the first command to conduct WQS regression in Stata. We anticipate that our contribution will further promote the use of appropriate statistical methods for handling multiple correlated predictors.
11:45 a.m.
Summarizing data from continuous glucose monitors using the cgmstats package
Natalie Daya Malek, Johns Hopkins University
The use of wearable continuous glucose monitors (CGMs) is growing rapidly. The latest CGM systems do not require fingerstick calibration, are minimally invasive, and are frequently used in research studies. CGM sensors are typically worn for up to 2 weeks and record interstitial glucose measurements every minute to every 15 minutes, depending on the sensor used. CGM systems generate hundreds of measurements per day and thousands of measurements in one person over a single wear. There is a need for tools that allow researchers to efficiently organize and summarize the wealth of data on glucose patterns produced by CGM systems. We developed the cgmstats package, which generates CGM summary measures from a variety of CGM systems and allows the user to flexibly define ranges and generate data visualizations. We provide an overview of the cgmstats package and examples of its use. The cgmstats package supports rigorous and reproducible analyses of CGM data.
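To make the idea of range-based summary measures concrete, here is a minimal Python sketch that computes mean glucose and the percentage of readings below, within, and above a user-defined range. It does not reproduce the cgmstats package or its syntax. The function name cgm_summary, the default range of 70 to 180 mg/dL, and the readings are assumptions chosen for illustration. Percentages of readings equal percentages of time only when readings are equally spaced.

```python
from statistics import mean


def cgm_summary(readings_mg_dl, low=70, high=180):
    """Summarize glucose readings (mg/dL) against a user-defined range.

    Hypothetical helper for illustration; assumes equally spaced readings.
    """
    if not readings_mg_dl:
        raise ValueError("at least one reading is required")
    n = len(readings_mg_dl)
    below = sum(1 for g in readings_mg_dl if g < low)
    above = sum(1 for g in readings_mg_dl if g > high)
    in_range = n - below - above  # readings with low <= g <= high
    return {
        "mean_glucose": mean(readings_mg_dl),
        "pct_below": 100 * below / n,
        "pct_in_range": 100 * in_range / n,
        "pct_above": 100 * above / n,
    }


# Eight made-up readings; a real wear would produce thousands.
readings = [65, 90, 110, 150, 185, 200, 120, 100]
summary = cgm_summary(readings)
print(summary)
```

Passing different low and high values, for example cgm_summary(readings, low=70, high=140), changes the range without touching the rest of the summary.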
12:15 p.m.
Lunch
1:15 p.m.
Demographic estimation and projection methods using Stata: Mortality, fertility, and multistate population dynamics
Jerônimo Muniz, Federal University of Minas Gerais
Reliable demographic analysis in settings with incomplete or imperfect data requires flexible and transparent estimation and projection tools. This paper presents an integrated suite of Stata-based methods for estimating mortality and fertility and for projecting populations by age, sex, and additional characteristics. First, I revisit intercensal approaches to mortality estimation, including census-based, death distribution, and iterative methods, and introduce tools for constructing single-decrement life tables and estimating age-specific net migration using two population age distributions and intercensal deaths. Second, I describe an enhanced implementation of the own-children method for estimating age-specific fertility rates, providing graphical summaries of recent fertility patterns, weighted subgroup estimates, and a wide range of reproductive indicators derived from biological mother-child links. Third, I present a matrix-based projection framework for forecasting population dynamics under specified schedules of fertility, mortality, and migration, supporting one- and two-sex models as well as multistate classifications such as region, race, or health status. Empirical illustrations draw on census and register data from Vietnam, Brazil, and Sweden, demonstrating applicability across diverse demographic contexts. Together, these methods offer a coherent and extensible toolkit for demographic estimation and projection using standard data sources.
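The simplest case of a matrix-based projection, a one-sex model with no migration (a Leslie matrix), can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the speaker's Stata tools: the function name project, the three age groups, and all rates are hypothetical, and the matrix product is written out as explicit sums.

```python
def project(population, fertility, survival, steps=1):
    """Project an age-structured population forward by `steps` intervals.

    Hypothetical one-sex model without migration.
    population: counts by age group, youngest first.
    fertility:  births per person per interval, one value per age group.
    survival:   probability of moving from group i to group i + 1
                (one value fewer than the number of age groups; the
                oldest group does not survive the interval).
    """
    if len(fertility) != len(population) or len(survival) != len(population) - 1:
        raise ValueError("rate vectors do not match the number of age groups")
    pop = list(population)
    for _ in range(steps):
        # First row of the Leslie matrix: births from every age group.
        births = sum(f * p for f, p in zip(fertility, pop))
        # Sub-diagonal: survivors ageing into the next group.
        survivors = [s * p for s, p in zip(survival, pop[:-1])]
        pop = [births] + survivors
    return pop


# Made-up schedule for three age groups.
start = [100, 80, 60]
fertility = [0.0, 1.5, 0.5]
survival = [0.8, 0.9]
print([round(x, 1) for x in project(start, fertility, survival, steps=1)])
# prints: [150.0, 80.0, 72.0]
```

Two-sex and multistate models extend the same idea by enlarging the state vector and the matrix of transition rates.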
2:00 p.m.
Modeling longitudinal core temperature in a crossover trial of farmworkers in California
Maria Montez Rath, Stanford University
Abstract forthcoming
3:00 p.m.
Adjourn
The symposium is conducted in real time and will not be recorded, so all registered users are encouraged to attend. Login information will be sent to registered users on 25 February.