The 2017 Nordic and Baltic Stata Users Group meeting will be held at the Karolinska Institutet in Stockholm on 1 September 2017.
This meeting will provide Stata users the opportunity to exchange ideas, experiences, and information on new applications of Stata. Representatives from StataCorp—David Drukker, Executive Director of Econometrics, and Jeff Pitblado, Executive Director of Statistical Software—will attend, and there will be the usual "Wishes and grumbles" session, at which you may air your thoughts to Stata developers. Everyone who is interested in using Stata is welcome.
Welcome and introduction
stcrmix: A Stata command for fitting mixed competing-risks proportional hazards models
Abstract: I present a new Stata command, stcrmix, that can fit competing-risks models with unobserved heterogeneity, for example, the mixed competing-risks proportional hazards model. I show in particular how to use stcrmix to fit the so-called timing-of-events model. stcrmix closely follows the implementation of the model by Gaure et al. (Journal of Econometrics 2007) and in their crmph R package. The mixing distribution is approximated by a discrete distribution, and the model is fit by the nonparametric maximum-likelihood estimator (NPMLE). For a given number of heterogeneity points, a new set of points that improve the likelihood function is added. Then the likelihood function is maximized with respect to the whole set of parameters. The procedure is repeated until there is no further improvement in the likelihood. I present the model and the estimation method, where I cover the likelihood function and how to find new candidates for heterogeneity points. I then present the syntax of the command. I show how to set up the data to fit the timing-of-events model, and I show an example, based on simulated data, of how to fit the model. Finally, I present results from Monte Carlo simulations and discuss other uses of the command.
Danish Center for Applied Social Sciences
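As a rough illustration of the estimation idea in the stcrmix abstract above, a likelihood with a discrete mixing distribution can be written in the generic form below. This is a sketch under stated assumptions, not the command's actual likelihood: the symbols $K$ (number of heterogeneity points), $v_k$ (point locations), $p_k$ (point probabilities), and $L_i$ (the conditional likelihood contribution of subject $i$) are notation introduced here for illustration.

```latex
% Generic mixture likelihood with K mass points (illustrative notation only)
L(\theta, p, v) \;=\; \prod_{i=1}^{N} \sum_{k=1}^{K} p_k \, L_i(\theta \mid v_k),
\qquad p_k \ge 0, \quad \sum_{k=1}^{K} p_k = 1 .
```

In this notation, the iterative procedure described in the abstract corresponds to increasing $K$, remaximizing over $(\theta, p, v)$, and stopping when the maximized likelihood no longer improves.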
Instantaneous geometric rates via generalized linear models
Abstract: The instantaneous geometric rate represents the instantaneous probability of an event of interest per unit of time. We propose to model the effect of covariates on the instantaneous geometric rate with two models: the proportional instantaneous geometric-rate model and the proportional instantaneous geometric-odds model. These models can be fit within the generalized linear model framework by using two nonstandard link functions that we implemented in the user-defined link programs log_igr and logit_igr. I illustrate their use through a real-data example.
Andrea Discacciati and Matteo Bottai
Gompertz regression parameterized as accelerated failure time model
Abstract: The only two parametric survival models currently implemented in the streg command in both the time and hazard metrics are the exponential and Weibull models. The Gompertz survival model is parameterized only as a proportional hazards model. An accelerated failure-time parameterization of the Gompertz distribution is available in the R package eha (Broström, G. 2014) but not in Stata. I present an accelerated failure-time parameterization of the Gompertz survival model. Parameters are estimated using maximum likelihood. Applications of the model are illustrated using demographic mortality data.
Filip Andersson and Nicola Orsini
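To make the distinction in the Gompertz abstract above concrete, the two parameterizations can be sketched as follows. The abstract does not give the presenters' exact formulas, so this uses the textbook definitions; the symbols $\lambda$ (scale), $\gamma$ (shape), $\beta$ (proportional hazards coefficients), and $\alpha$ (accelerated failure-time coefficients) are assumed notation.

```latex
% Baseline Gompertz hazard and survivor function
h_0(t) = \lambda \exp(\gamma t), \qquad
S_0(t) = \exp\!\left\{ -\frac{\lambda}{\gamma} \left( e^{\gamma t} - 1 \right) \right\}

% Proportional hazards (hazard metric): covariates scale the hazard
h_{\mathrm{PH}}(t \mid x) = h_0(t) \exp(x\beta)

% Accelerated failure time (time metric): covariates rescale time
S_{\mathrm{AFT}}(t \mid x) = S_0\!\left( t \, e^{-x\alpha} \right), \qquad
h_{\mathrm{AFT}}(t \mid x) = e^{-x\alpha} \, \lambda \exp\!\left( \gamma \, t \, e^{-x\alpha} \right)
```

Under the time-metric form, a covariate changes both the scale ($\lambda e^{-x\alpha}$) and the shape ($\gamma e^{-x\alpha}$) of the Gompertz hazard, whereas under the hazard-metric form only the scale changes. The two forms are therefore different models rather than reparameterizations of each other.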
Modeling multiple timescales using flexible parametric survival models
Abstract: Time-to-event data are frequently modeled by considering only one main timescale. This may not be optimal for many research questions. When two timescales have been considered, modeling is often limited to including one main timescale and a time-split variable version of the second timescale. Unfortunately, this can be computationally intensive. Another less optimal solution is to include a time-fixed version of the second timescale, which does not sufficiently capture the trend of interest.
Because time advances at the same rate on every timescale, each timescale can be written as a function of the others. For example, attained age after a disease diagnosis equals the age at diagnosis plus the time since diagnosis. Likelihood functions of standard time-to-event models cannot be written analytically when the model includes multiple timescales as functions of each other. However, we have developed an approach to model the log hazard using flexible parametric survival models, employing numerical integration to obtain the likelihood function under an arbitrary number of timescales. Thus, we present a new Stata command that offers the possibility to model multiple timescales simultaneously using flexible parametric survival models on the log hazard scale.
Therese M-L Andersson
Michael J. Crowther
University of Leicester
Paul C. Lambert
University of Leicester
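The numerical-integration idea in the multiple-timescales abstract above can be sketched in a few lines of Python. This is only an illustration, not the presenters' command: the function names, the linear log hazard (standing in for the spline terms of a flexible parametric model), and all coefficient values are hypothetical. The sketch uses Gauss-Legendre quadrature to approximate the cumulative hazard when the log hazard depends on both time since diagnosis and attained age.

```python
import numpy as np


def log_hazard(time_since_dx, attained_age):
    """Hypothetical log hazard on two timescales (coefficients are made up).

    A flexible parametric model would use spline terms here; a linear form
    keeps the sketch short.
    """
    return -3.0 - 0.10 * time_since_dx + 0.02 * attained_age


def cumulative_hazard(t, age_at_dx, n_nodes=30):
    """Approximate H(t) = int_0^t exp(log_hazard(u, age_at_dx + u)) du."""
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    u = 0.5 * t * (nodes + 1.0)   # map quadrature nodes from [-1, 1] to [0, t]
    w = 0.5 * t * weights         # rescale the weights to the same interval
    # Attained age is the second timescale: age at diagnosis plus elapsed time.
    return float(np.sum(w * np.exp(log_hazard(u, age_at_dx + u))))


def log_likelihood_contribution(t, event, age_at_dx):
    """Log-likelihood contribution of one subject: d * log h(t) - H(t)."""
    return event * log_hazard(t, age_at_dx + t) - cumulative_hazard(t, age_at_dx)
```

For example, `log_likelihood_contribution(5.0, 1, 60.0)` gives the contribution of a subject diagnosed at age 60 who had the event 5 years later. Because the hazard is integrated numerically, the same code works for any smooth log-hazard function of the two timescales.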
Causal inference with sample selection
Abstract: I discuss how to use the new extended regression model (ERM) commands to estimate average causal effects when the outcome is censored or when the sample is endogenously selected. I also discuss how to use these commands to estimate causal effects in the presence of endogenous explanatory variables.
A journey to latent class analysis (LCA)
Abstract: Stata's estimation commands have evolved in how they account for groups in the sample. Since the early days of Stata, fitting models with group-specific parameters has simply been a matter of using the if clause to condition on group membership. Inference between group-specific parameters was made possible with the introduction of suest in Stata 8. In Stata 12, we introduced sem and group analysis for structural equation models (SEMs). Stata 15 introduces two kinds of group analysis for generalized SEMs. For observed groups, gsem has the new group() option. For latent groups, gsem has the lclass() option and the ability to perform LCA.
Intervention time-series models using transfer functions
Abstract: The evaluation of the impact of policies on the population's health has become a major commitment for states and communities. The intervention (or interrupted) time-series design is the strongest and most commonly used quasi-experimental design to assess the impacts of health interventions when standard randomized trials are not feasible. The recent user-written command itsa and its related postestimation commands (Stata Journal 15–2, Stata Journal 17–1) greatly facilitate testing for shifts in level and slope after the intervention using linear regression models, with the standard errors adjusted for the correlation of the repeated measures over time (newey, prais). A more advanced approach is ARIMA models with transfer functions, proposed by Box and Tiao (JASA, 1975). Although transfer function models have been successfully used in several research areas, Stata does not have a command specially designed for them. In this presentation, we will explain how to fit these types of models in Stata. We will also discuss applications of the method.
XingWu Zhou and Nicola Orsini
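To give a feel for what a transfer function adds to the intervention model described in the abstract above, the Python sketch below computes the deterministic effect of a step intervention under a first-order transfer function, m_t = delta * m_{t-1} + omega * S_t, where S_t is 0 before the intervention and 1 afterward. This is a generic textbook form, not necessarily the specification used by the presenters; the function name and the parameter values in the example are hypothetical, and the ARIMA noise component is left out.

```python
def intervention_effect(n_periods, start, omega, delta):
    """Effect of a step intervention under a first-order transfer function.

    Implements m_t = delta * m_{t-1} + omega * S_t with m_{-1} = 0, where
    S_t = 1 for t >= start and 0 otherwise. For |delta| < 1 the effect
    builds up gradually toward the long-run level omega / (1 - delta).
    """
    effect = []
    previous = 0.0
    for t in range(n_periods):
        step = 1.0 if t >= start else 0.0
        current = delta * previous + omega * step
        effect.append(current)
        previous = current
    return effect


# Hypothetical values: an immediate shift of 2 units that accumulates over time.
effect = intervention_effect(n_periods=200, start=10, omega=2.0, delta=0.5)
```

With `delta = 0` the effect is an abrupt, permanent level shift of size `omega`, which is the kind of change a segmented linear regression captures. With `0 < delta < 1` the effect is gradual, which a simple level-shift term cannot represent.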
The new qcm command for nonlinear quantile coefficient models
Abstract: We present qcm, a new command for fitting nonlinear quantile coefficient models. These are parametric models for the conditional quantile function of an outcome variable given covariates. The parameters are defined as functions of the order of the quantile. We briefly introduce the method and illustrate the use of qcm through an example of the estimation of percentiles of respiratory function in healthy children.
Matteo Bottai and Nicola Orsini
One-stage dose–response meta-analysis
Abstract: Synthesis of linear and nonlinear exposure-disease associations based on summarized data is often limited to epidemiological studies reporting more than two nonreferent categories. Being able to specify a model on the combined data rather than within each study would allow inclusion of all the available information regardless of how the exposure was initially categorized. Within the general framework of a linear mixed-effect model, the aim of this presentation is to show how to specify a one-stage dose–response model suitable for this type of data. Estimation based on likelihood and restricted maximum likelihood is implemented in a new command. Simulated data and real examples will be used to illustrate the advantages offered by the proposed approach.
Nicola Orsini and Alessio Crippa
Text analytics using WordStat 7 within Stata
Abstract: WordStat for Stata offers advanced text analytics features, allowing Stata 13, 14, and 15 users to analyze text stored in both short- and long-string variables using numerous text-mining features, such as topic modeling, document clustering, automatic classification, GIS mapping, and state-of-the-art dictionary-based content analysis. Extracted themes may then be related to structured data using various statistics and graphic displays. WordStat also offers a tool to create a Stata project from lists of documents (including .DOC, HTML, and PDF files) and to automatically extract numerical data, categorical data, and dates from them.
The use of Stata in medical statistics and epidemiology: A long journey
Abstract: In my talk, I will review how Stata has facilitated teaching epidemiology and biostatistics in many master's and PhD programs. Many procedures, such as those available in epitab, elegantly describe simple and adjusted estimation and testing in both cohort and case-control studies. The lexis macro has turned into stsplit, a powerful procedure. The correspondence between the underlying methods and simple application in Stata is a unique feature of the software. User contributions and interactions have been valuable for the development of the software.
University of Milano–Bicocca and Karolinska Institutet
Wishes and grumbles
To register for the meeting, please email your name, affiliation, and contact details to Metrika Consulting.
The meeting will be held at the Karolinska Institutet in Stockholm, Sweden.
Nobels väg 10, Solna
The meeting is free, but registration is required.
Visit the official meeting page to register or to see more information.
University of Leicester & Karolinska Institutet
Biostatistics Team, Department of Public Health Sciences, Karolinska Institutet
The logistics organizer for the 2017 Nordic and Baltic Stata Users Group meeting is Metrika Consulting, the distributor of Stata in the Nordic countries and Baltic states—Norway, Denmark, Finland, Sweden, Iceland, Estonia, Latvia, and Lithuania.
View the proceedings of previous Stata Users Group meetings.