The 2016 Nordic and Baltic Stata Users Group meeting was held on September 13, but you can still interact with the user community and learn more about the presentations that were shared.


Processing work-history data to quantify occupational exposure using Stata
Abstract: Work-history data can be linked to job-exposure matrices (JEMs), containing job-specific exposure ratings across different time periods, for retrospective assessment of job-specific exposure. However, work-history data often present major challenges, including mapping job titles into standardized occupational codes, quality and consistency checks, and complexity arising from overlapping employment spells. We demonstrate how Stata can be used to resolve some of these challenges, specifically, complex spell structure arising because of overlapping employment periods and gaps.
Additional information
Ronnie Babigumira
Cancer Registry of Norway
Jo S. Stenehjem
Cancer Registry of Norway
Tom K. Grimsrud
Cancer Registry of Norway
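
To make the overlapping-spell problem in the abstract above concrete, here is a minimal sketch in Python (not Stata, and not the presenters' code). It cuts a work history into non-overlapping time segments and records which jobs were held in each, so that overlaps and gaps become explicit before any exposure ratings are linked. The function name split_spells, the (job, start, end) layout with an exclusive end year, and the example jobs are all assumptions made for illustration.

```python
def split_spells(spells):
    """Split possibly overlapping spells into non-overlapping segments.

    spells: list of (job, start, end) tuples; end is exclusive (assumed layout).
    Returns a list of (start, end, jobs) tuples, where jobs is a sorted tuple
    of the jobs held during that segment. An empty tuple marks a gap.
    """
    # Every spell boundary is a potential cut point.
    cuts = sorted({t for _, start, end in spells for t in (start, end)})
    segments = []
    for lo, hi in zip(cuts, cuts[1:]):
        # A job belongs to the segment if its spell overlaps [lo, hi).
        jobs = tuple(sorted(job for job, start, end in spells
                            if start < hi and end > lo))
        segments.append((lo, hi, jobs))
    return segments


# Hypothetical work history with one overlap and one gap.
history = [("welder", 1980, 1985), ("painter", 1983, 1990), ("driver", 1992, 1995)]
for segment in split_spells(history):
    print(segment)
```

Each segment can then be matched to a job-exposure matrix by job and period. How exposure is assigned when two jobs overlap (for example, averaged or taken from the main job) is a study-specific decision that this sketch does not make.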
Clinical database management: From raw data through study tabulations to analysis datasets
Abstract: There are about as many ways to compile datasets for analyses as there are statisticians or epidemiologists. Some like to have wide datasets with one observation per subject and tons of variables, while others like the long format with many observations per subject. I will present the Raw/Tabulation Datasets/Analyses Datasets (Raw/TD/AD) method inspired by the standards for clinical datasets set by the Clinical Data Interchange Standards Consortium (CDISC). The CDISC standards are widely adopted by the pharmaceutical industry but less so within academia, probably because of the rigidity of the standards. While the standards, specifically the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM), are probably too extensive for academic researchers, elements of the standards could inspire a more rigorous setup of clinical databases. I will present the basic structure of first compiling raw study data into standard datasets such as study visits, demographics, vital signs, etc., and then compiling analyses datasets introducing derived variables, imputations, and other formatting to form datasets ready for analyses. I will provide examples from a recently finished randomized controlled trial.
Additional information
Inge Christoffer Olsen
Diakonhjemmet Hospital
Moving from SAS to Stata, making customized tables in RTF using rtfutil and other packages
Abstract: When one moves from SAS to Stata, a major drawback is losing the ability to produce customized tables in MS Word, which SAS provides through the Output Delivery System (ODS) destination for RTF. Most medical articles are prepared for submission in Word, and it is preferable to produce ready-to-use tables without the need for error-prone cutting and pasting from the Results window. I will show how this can be done in Stata with user-written packages such as parmest, xcontract, and xcollapse for making datasets of results (often denoted resultssets) and then with listtab and the rtfutil package for producing the RTF tables. The rtfutil package can also be used to include graphics in the RTF file, enabling the production of study reports and tables, listings, and figures (TLFs). I will provide examples from a recent randomized controlled trial.
Additional information
Inge Christoffer Olsen
Diakonhjemmet Hospital
Creating LaTeX and HTML documents from within Stata using texdoc and webdoc
Abstract: At the 2009 meeting in Bonn, I presented a new Stata command called texdoc. The command allowed weaving Stata code into a LaTeX document, but its functionality and its usefulness for larger projects were limited. In the meantime, I heavily revised the texdoc command to simplify the workflow and improve support for complex documents. The command is now well suited, for example, to generate automatic documentation of data analyses or even to write an entire book. In this talk, I will present the new features of texdoc and provide examples of their application. Furthermore, I will present a newly released companion command called webdoc that can be used to produce HTML or Markdown documents.
Ben Jann
University of Bern
Estimating treatment effects from observational data using teffects, stteffects, and eteffects
Abstract: This talk reviews treatment-effect estimation with observational data and discusses Stata examples that illustrate syntax and parameter interpretation. After reviewing the potential-outcome framework, the talk discusses estimators for the average treatment effect (ATE) that require exogenous treatment assignment and some estimators that allow for endogenous treatment assignment. The talk also discusses checks for balance, checks for overlap, and some estimators for the ATE from survival-time data. Finally, the talk discusses estimating and interpreting quantile treatment effects.
Additional information
David M. Drukker
Multistate survival analysis in Stata
Abstract: Multistate models are increasingly being used to model complex disease profiles. By modeling transitions between disease states, accounting for competing events at each transition, we can gain a much richer understanding of patient trajectories and how risk factors act over the entire disease pathway. We will introduce some new Stata commands for the analysis of multistate survival data. These include msset, a data preparation tool that converts a dataset from wide (one observation per subject, multiple time and status variables) to long (one observation for each transition for which a subject is at risk). We develop a new estimation command, stms, that allows the user to fit different parametric distributions for different transitions, simultaneously, while allowing sharing of covariate effects across transitions. Finally, we present predictms, which calculates transition probabilities and many other useful measures of absolute risk, following the fit of any model using streg, stms, or stcox, using either a simulation approach or the Aalen–Johansen estimator. We illustrate the software using a dataset of patients with primary breast cancer.
Additional information
Michael J. Crowther
University of Leicester & Karolinska Institutet
Paul C. Lambert
University of Leicester & Karolinska Institutet
Joint modeling of longitudinal and survival data
Abstract: Joint modeling of longitudinal and survival-time data has been gaining more and more attention in recent years. Many studies collect both longitudinal and survival-time data. Longitudinal, panel, or repeated-measures data record data measured repeatedly at different time points. Survival-time or event history data record times to an event of interest such as death or onset of a disease. The longitudinal and survival-time outcomes are often related and should thus be analyzed jointly. Three types of joint analysis may be considered: 1) evaluation of the effects of time-dependent covariates on the survival time; 2) adjustment for informative dropout in the analysis of longitudinal data; and 3) joint assessment of the effects of baseline covariates on the two types of outcomes. In this presentation, I will provide a brief introduction to the methodology and demonstrate how to perform these three types of joint analysis in Stata.
Additional information
Yulia Marchenko
Creating efficient designs for discrete choice experiments
Abstract: Over the past decades, the discrete choice experiment (DCE) has become a popular tool for investigating individual preferences in several fields. This talk will describe the dcreate command, which creates efficient designs for DCEs using the modified Fedorov algorithm. The algorithm maximizes the D-efficiency of the design based on the covariance matrix of the conditional logit model.
Additional information
Arne Risa Hole
University of Sheffield
Using Monte Carlo simulation for nonstandard sample size/power calculation
Abstract: Prospective sample-size calculation is an important aspect of study design, as is retrospective power calculation, particularly when statistical significance is not achieved. For comparatively simple hypothesis tests applied to simple experimental designs, these quantities can be calculated using closed-form analytic expressions. However, as designs and models become more complicated, the derivation of power functions becomes difficult, and simulation is often used when analytic approaches become intractable. This talk will illustrate the use of Stata's simulation capabilities to calculate statistical power for hypothesis tests based on arbitrarily complex statistical models. Once a model is specified as an alternative hypothesis, simulation is typically straightforward, and Stata's ability to capture and accumulate model parameters enables straightforward calculation of statistical power.
Additional information
Mike Jones
Macquarie University
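
The simulation approach described in the abstract above can be sketched in a few lines. The example below is written in Python rather than Stata and is not the presenter's code. It draws data under an assumed alternative hypothesis, runs the test on each simulated dataset, and reports the share of rejections as the estimated power. The two-group design, the normal outcome, the large-sample z test, and all parameter names (n_per_group, effect, sd, alpha, reps, seed) are assumptions chosen for illustration.

```python
import random
import statistics


def simulate_power(n_per_group, effect, sd=1.0, alpha=0.05, reps=2000, seed=1):
    """Estimate power of a two-sided two-sample z test by Monte Carlo simulation.

    Data are drawn from normal distributions whose means differ by `effect`
    (the assumed alternative); power is the proportion of replications in
    which the null hypothesis of equal means is rejected.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        # Standard error of the difference in means, using sample variances.
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        if abs(diff / se) > z_crit:
            rejections += 1
    return rejections / reps


# Estimated power for a hypothetical design: 64 per group, effect of 0.5 SD.
print(simulate_power(n_per_group=64, effect=0.5))
```

The same loop works for more complicated models: only the data-generating step and the test inside the loop change. Setting effect to 0 gives a quick check that the rejection rate stays near the nominal alpha.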
The case–cohort design: What it is and how it can be used in register-based research
Abstract: This presentation will give a brief theoretical background and history of case–cohort studies, which date back to the key publication by Prentice in 1986. Examples of situations when the case–cohort design is useful will be given, in particular, in a register-based setting with total population registers. The case–cohort design will be compared with the nested case–control design, and advantages and disadvantages will be presented. From a case–cohort design, it is possible to estimate the same measures of effect (for example, hazards, hazard ratios, hazard differences) that can be estimated in a standard cohort study, provided that weights are included to account for the oversampling of cases. Hence, in practice, the analysis of a case–cohort study is similar to that of a cohort study (for example, Cox regression, Poisson regression, and flexible parametric models), with the addition of proper weights. Stata code for how to sample a case–cohort study from a cohort study and how to incorporate weights into the analysis will be presented. As an example, I will present a study on the risk of breast cancer following pregnancy using data from the Swedish Multi-Generation Register and the Swedish Cancer Register. In this study, I utilized the case–cohort design to reduce the analytical dataset and to improve computational efficiency.
Additional information
Anna Johansson
Karolinska Institutet
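
The sampling and weighting idea in the abstract above can be illustrated with a minimal Python sketch (not Stata, and not the presenter's code). It draws a random subcohort from a full cohort, keeps every case, and attaches inverse-probability weights: cases get weight 1 and non-cases in the subcohort get weight 1/fraction. This is one common weighting scheme and is assumed here; the presentation may have used a different one. The function name, the record layout (dicts with "id" and "case"), and the sampling fraction are hypothetical.

```python
import random


def sample_case_cohort(cohort, fraction, seed=1):
    """Draw a case-cohort sample with inverse-probability weights.

    cohort: list of dicts with keys 'id' and 'case' (bool) -- assumed layout.
    fraction: probability that any cohort member enters the subcohort.
    All cases are kept with weight 1; non-cases are kept only if sampled
    into the subcohort and get weight 1 / fraction.
    """
    rng = random.Random(seed)
    sample = []
    for person in cohort:
        in_subcohort = rng.random() < fraction
        if person["case"]:
            sample.append({**person, "subcohort": in_subcohort, "weight": 1.0})
        elif in_subcohort:
            sample.append({**person, "subcohort": True, "weight": 1.0 / fraction})
    return sample


# Hypothetical cohort: 10,000 people, of whom the first 100 are cases.
cohort = [{"id": i, "case": i < 100} for i in range(10_000)]
sample = sample_case_cohort(cohort, fraction=0.1)
print(len(sample), "records kept out of", len(cohort))
```

The weighted non-cases stand in for all non-cases in the full cohort, which is why a weighted analysis of the much smaller sample can target the same quantities as the full-cohort analysis.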

Training course on flexible parametric survival models

On September 14, the day after the meeting, Paul Lambert gave a one-day course on flexible parametric survival models. Professor Lambert is coauthor of the Stata program stpm2 and the book Flexible Parametric Survival Analysis Using Stata: Beyond the Cox Model.


Scientific committee

Tor Åge Myklebust (Coordinator)
The Cancer Registry of Norway, Institute of Population-based Cancer Research

Arne Risa Hole
University of Sheffield

Hein Stigum
Norwegian Institute of Public Health

Morten Wang Fagerland
Oslo Centre for Biostatistics and Epidemiology (OCBE)

Peter Hedström
Institute of Analytical Sociology, Linköping University

Committee email: [email protected]

Meeting coordinator

Bjarte Aagnes
The Cancer Registry of Norway, Institute of Population-based Cancer Research

Logistics organizer

The Stata Users Group meeting is jointly organized by The Cancer Registry of Norway, Institute of Population-based Cancer Research, and Metrika Consulting. Metrika Consulting is the distributor of Stata in the Nordic countries and Baltic states: Norway, Denmark, Finland, Sweden, Iceland, Estonia, Latvia, and Lithuania.

View the proceedings of previous Stata Users Group meetings.