
The 16th Northern European Stata Conference will take place on 29 August 2025 at Karolinska Institutet.

This conference will provide Stata users with the opportunity to exchange ideas, experiences, and information on new applications of Stata. Representatives from StataCorp will attend and host an open panel discussion, so you can share your questions and feedback directly with Stata developers. Anyone interested in using Stata is welcome. No level of expertise is assumed for presenters or attendees.


Program

All times are in CEST (UTC +2)

Friday, 29 August

8:30–9:00 Registration
9:00–9:30 wqsreg: A Stata command for weighted quantile sum regression
Abstract:
Weighted quantile sum (WQS) regression is a flexible statistical method for quantifying the association between a set of possibly correlated predictors and a health outcome. This approach is gaining substantial popularity in several fields such as environmental epidemiology, where it allows estimating the overall effects of complex environmental mixtures as well as the specific contributions of each mixture component. A Stata command for fitting this increasingly popular procedure, however, has not been developed yet. To address this gap, we have developed a new command, wqsreg, that enables users to fit WQS regression models for continuous outcomes while supporting the several flexible components of this framework, including adjusting for potential confounders; estimating both positive and negative overall mixture effects; providing robust weight estimates through bootstrap; specifying the method used to rank variables included in the mixture (for example, quartiles); setting limits on the number of iterations performed before optimization; and fixing the seed and customizing save options. wqsreg returns the estimates from WQS regression, plots the estimated weights, and creates a dataset containing the WQS index for each subject. In this talk, we will introduce the key features of WQS regression, describe wqsreg, and demonstrate its use through examples. Given the increasing importance of appropriately exploring complex multidimensional exposures such as environmental mixtures, this command provides Stata users with one of the first commands to apply a modern computational approach specifically developed for these settings.

Contributors:
Stefano Renzetti
Università degli Studi di Parma
Andrea Bellavia
Harvard T.H. Chan School of Public Health

Marta Ponzano
Università di Genova
9:30–10:00 Fitting joinpoint models for descriptive analysis of cancer trends in Stata
Abstract:
Temporal trends in cancer incidence and mortality rates are often investigated visually, with interest in changes in the gradient of increases or decreases in the rates. Joinpoint models are used to help quantify the trends, using linear splines where both the number and location of the knots (joinpoints) are selected as part of the modeling process. I will describe a Stata implementation of joinpoint models and introduce the joinpoint command and associated postestimation commands. The approach can be computationally intensive because all possible combinations of the number and the location of the knots are fit when selecting the models. I will describe how use of Mata to fit the models leads to dramatic speed improvements. The joinpoint command has various options, for example, choosing different model-selection criteria and choosing the maximum number of knots and the minimum number of data points between knots. Output options include estimation of the annual percent change (APC), with two different methods to calculate confidence intervals. There is a postestimation predict command and a command to provide visual summaries of the fitted model.
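
As a rough illustration of the idea in this abstract, the sketch below fits a linear spline with a single knot to log rates, trying every admissible knot location and keeping the one with the smallest residual sum of squares. It is written in Python with the standard library only and is not the joinpoint command or its algorithm: the function names, the restriction to one knot, the least-squares selection criterion, and the synthetic data are all assumptions made for illustration.

```python
import math

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_one_knot(times, log_rates, knot):
    """Least-squares fit of log rate = b0 + b1*t + b2*max(t - knot, 0)."""
    rows = [[1.0, float(t), max(t - knot, 0.0)] for t in times]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_rates)) for i in range(3)]
    beta = solve_linear_system(xtx, xty)
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, y in zip(rows, log_rates))
    return beta, sse

def best_single_joinpoint(times, log_rates, min_points=2):
    """Try each interior time point as the knot; keep the smallest SSE."""
    best = None
    for knot in times[min_points:-min_points]:
        beta, sse = fit_one_knot(times, log_rates, knot)
        if best is None or sse < best[2]:
            best = (knot, beta, sse)
    return best

def annual_percent_change(slope):
    """APC implied by a slope on the log-rate scale."""
    return 100.0 * (math.exp(slope) - 1.0)

# Synthetic example (hypothetical data): years since the first calendar year,
# with the trend changing direction at t = 10.
times = list(range(20))
log_rates = [3.0 + 0.02 * t - 0.05 * max(t - 10, 0) for t in times]

knot, beta, sse = best_single_joinpoint(times, log_rates)
apc_before = annual_percent_change(beta[1])
apc_after = annual_percent_change(beta[1] + beta[2])
print(knot, round(apc_before, 2), round(apc_after, 2))
```

Time is measured from the first year so that the normal equations stay well conditioned. With several knots, the number of candidate models grows combinatorially, which is the computational burden the abstract refers to.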

Paul C Lambert
Cancer Registry of Norway and Karolinska Institutet
10:00–10:30 Stata 20 will have correct inference on random effects
Abstract:
Mixed models, and random effects in particular, are used routinely to model data with dependent observations and effect heterogeneity. However, while random effects are convenient for specifying a model, they often complicate inference. As a result, popular software for statistical analysis often does not provide confidence intervals for random effect parameters by default, or worse, provides provably unreliable ones. This talk discusses the challenges and possible solutions.

Matteo Bottai
Karolinska Institutet
10:30–11:00 Break
11:00–12:00 Modeling interval-censored event-time data with Stata
Abstract:
Do you have event-time data that you would like to model but are unsure exactly when the events occurred? In survival analysis, interval-censored event-time data arise when the event of interest is not observed precisely but is known to have occurred within a specific time interval. Stata 17 introduced the stintcox command to fit genuine semiparametric Cox models for such data, and Stata 18 expanded its capabilities by adding support for time-varying covariates (TVCs). Building on this, Stata 19 introduces the new stmgintcox command, enabling the modeling of interval-censored multiple-event data while accounting for potential correlations between event times across different event types. In this presentation, I will describe the fundamental types of interval-censored data and demonstrate how to fit the semiparametric Cox proportional hazards model using the stintcox command. I will provide examples using single-record and multiple-record-per-subject datasets and show how to incorporate TVCs. Additionally, I will discuss how to interpret and plot results, and how to assess the proportional hazards assumption. Finally, I will show you how to fit a marginal Cox proportional hazards model to interval-censored multiple-event data and perform a more powerful test for common covariate effects across all events.

Xiao Yang
StataCorp
12:00–1:00 Lunch
1:00–1:30 Prediction intervals in meta-analysis: A clearer view of heterogeneity and expected future findings using Stata
Abstract:
Meta-analyses in epidemiology often rely on 95% confidence intervals (CIs) to summarize the precision of pooled estimates. However, CIs are frequently misinterpreted and offer limited insight into how study results vary (heterogeneity) or what future studies might show. Prediction intervals (PIs), by contrast, directly reflect such between-study variability and estimate the range within which the true effect of a future study is expected to fall, providing a more interpretable and policy-relevant view of uncertainty. This talk presents the rationale for using PIs in meta-analyses of odds ratios (ORs), drawing on the methods described in Borenstein’s widely used text on the subject. PIs will be contrasted with traditional heterogeneity measures like I², which is often misused or overinterpreted as a precise index of inconsistency. In addition, PIs allow framing heterogeneity in terms of expected future effects and provide a more intuitive and decision-relevant perspective. Using Stata, I will demonstrate how to compute and visualize PIs, including enhanced graphical methods based on probability density functions (PDFs). Such plots go beyond Stata’s whiskerlike default PI displays in forest plots by better illustrating both the expected range and the relative likelihood of future effect sizes, conveying direction, dispersion, and uncertainty in a single visual. Attendees will gain a practical and conceptual understanding of how PIs can complement or even surpass CIs and I² as tools for interpreting and applying meta-analytic evidence in epidemiology.
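
To make the PI computation concrete, here is a minimal Python sketch of the commonly used random-effects prediction interval, pooled estimate ± t(k−2) × sqrt(τ² + SE²), computed on the log odds-ratio scale and then exponentiated. It is not taken from the talk: the function name, the input values, and the choice to pass the t critical value in as an argument (rather than compute it) are assumptions made for illustration.

```python
import math

def prediction_interval_or(pooled_log_or, se_pooled, tau2, t_crit):
    """Prediction interval for the odds ratio in a future study.

    pooled_log_or : pooled random-effects estimate on the log-OR scale
    se_pooled     : standard error of that pooled estimate
    tau2          : estimated between-study variance on the log-OR scale
    t_crit        : critical value of the t distribution with k - 2 degrees
                    of freedom (k = number of studies), supplied by the caller
    """
    half_width = t_crit * math.sqrt(tau2 + se_pooled ** 2)
    lower = math.exp(pooled_log_or - half_width)
    upper = math.exp(pooled_log_or + half_width)
    return lower, upper

# Hypothetical inputs: 10 studies, pooled OR of 1.5, SE 0.10, tau^2 = 0.04.
# 2.306 is the 97.5th percentile of the t distribution with 8 degrees of freedom.
pi_lower, pi_upper = prediction_interval_or(math.log(1.5), 0.10, 0.04, 2.306)
print(round(pi_lower, 2), round(pi_upper, 2))
```

Because τ² enters the interval width directly, the PI stays wide when studies disagree, even if the pooled estimate itself is precise. This is why the PI describes heterogeneity in a way that a CI does not.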

David J. Miller
U.S. Environmental Protection Agency (retired)
1:30–2:00 Supplementing risk ratios in sibling analysis: Estimating clinically useful measures from family-based analysis
Abstract:
Family-based designs, like sibling comparisons, are powerful tools for addressing confounding, but they often rely solely on relative measures such as odds ratios or hazard ratios—limiting their interpretability for clinical and policy decision-making. In this talk, I introduce the marginalized between-within framework, a method that enhances family-based analyses by enabling the estimation of absolute risks and other clinically meaningful metrics. I will begin with an overview of sibling comparison methods and the rationale behind decomposing effects into within- and between-family components. Then, using Swedish registry data, I’ll demonstrate how this framework can be applied to assess the impact of maternal smoking on infant mortality. The model allows us to estimate absolute risk differences, average treatment effects, attributable fractions, and numbers needed to harm—metrics that are often more useful than relative estimates. Compared with traditional conditional logistic or stratified Cox regression models, the marginalized between-within approach offers similar relative estimates but adds the crucial ability to anchor results to a global baseline, making absolute measures possible. These measures provide clearer insights for public health and policy interventions.

Hugo Sjöqvist
Karolinska Institutet
2:00–2:30 Imputing right-skewed bounded biomarkers in partially measured cohorts
Abstract:
In large medical and epidemiological studies, important biomarkers are often available only for a limited fraction of participants because of high laboratory costs or feasibility constraints. This results in a high proportion of missing values. Imputation strategies can be employed to prevent the loss of information. However, imputing biomarker values is challenging because biomarker distributions are right-skewed and naturally bounded. In this talk, I compare two imputation strategies that can handle such challenges: a likelihood-based approach and logistic quantile imputation implemented in Stata. I evaluate the performance of both methods through simulation, assessing bias and inferential errors. The approaches are illustrated with a practical example of recently discovered blood biomarkers in Alzheimer’s research. The results provide some insight into recovering biomarker distributions when outcome data are fully observed but biomarkers are only partially measured.

Contributor:
Robert Thiesmeier
Karolinska Institutet

Nicola Orsini
Karolinska Institutet
2:30–3:00 Break
3:00–4:00 Linking frames in Stata
Abstract:
This presentation gives an overview of data frames in Stata. I demonstrate the basics of working with multiple datasets in Stata. I cover most of the frames suite of commands, touching on frame creation and management, linking frames, copying variables from linked frames, alias variables, and working with a set of frames.

Jeff Pitblado
StataCorp
4:00–5:00 Open panel discussion with Stata developers
Contribute to the Stata community by sharing your feedback with StataCorp's developers. From feature improvements to bug fixes and new ways to analyze data, we want to hear how Stata can be made better for our users.

Scientific committee

Matteo Bottai (Chair)
Karolinska Institutet
Paul Lambert
Karolinska Institutet
Nicola Orsini
Karolinska Institutet

Registration and venue

The conference is free, but registration is required. All participants are responsible for their own travel and accommodation expenses.

To register for the conference, please email your name, affiliation, and contact details.

Registration deadline is 28 August 2025.

Visit the official conference page for more information.


Logistics organizer

The 2025 Northern European Stata Conference is jointly organized by Metrika Consulting AB, the official distributor of Stata for Russia and the Nordic and Baltic countries, and the Division of Biostatistics at Karolinska Institutet.

View the proceedings of previous Stata Conferences and international meetings.