The 2016 UK Stata Users Group meeting was September 8–9, but you can still interact with the user community even after the meeting and learn more about the presentations shared.
Proceedings
Thursday, September 8
9:15–9:45 
Abstract:
The Rubin method of confounder adjustment, in its
21st-century version, is a two-phase method for using
observational data to estimate a causal treatment effect
on an outcome variable. It involves first finding a
propensity model in the joint distribution of a
treatment variable and its confounders (the design
phase), and then estimating the treatment effect from
the conditional distribution of the outcome,
given the treatments and confounders (the
analysis phase). In the design phase, we want to limit
the level of spurious treatment effect that might be
caused by any residual imbalance between treatment and
confounders that may remain, after adjusting for the
propensity score by propensity matching and weighting
and/or stratification.
A good measure of this is Somers's D(W|X), where W is a confounder or a propensity score and X is the treatment variable. The SSC package somersd calculates Somers's D for a wide range of sampling schemes, allowing matching and weighting and restriction to comparisons within strata. Somers's D has the feature that if Y is an outcome, then a higher-magnitude D(Y|X) cannot be secondary to a lower-magnitude D(W|X), implying that D(W|X) can be used to set an upper bound to the size of a spurious treatment effect on an outcome. For a binary treatment variable X, D(W|X) gives an upper bound to the size of a difference between the proportions, in the two treatment groups, that can be caused for a binary outcome. If D(W|X) is less than 0.5, then it can be doubled to give an upper bound to the size of a difference between the means, in the two treatment groups, that can be caused for an equal-variance normal outcome, expressed in units of the common standard deviation for the two treatment groups. We illustrate this method using a familiar dataset, with examples using propensity matching, weighting, and stratification. We use the SSC package haif in the design phase to check for variance inflation caused by propensity adjustment and use the SSC package scenttest (an addition to the punaf family) to estimate the treatment effect in the analysis phase.
Roger B. Newson
Imperial College London
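
The balance measure described in this abstract can be illustrated with a small calculation. The following is a minimal sketch in Python, not the somersd implementation: it assumes the unweighted, unstratified definition D(W|X) = tau(X, W) / tau(X, X) over all ordered pairs, and the propensity scores and treatment indicators are made-up values.

```python
from itertools import permutations


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def somers_d(w, x):
    """Somers's D(W|X) = tau(X, W) / tau(X, X), unweighted, no strata."""
    num = 0
    den = 0
    for i, j in permutations(range(len(x)), 2):
        sx = sign(x[i] - x[j])
        num += sx * sign(w[i] - w[j])  # concordant minus discordant pairs
        den += sx * sx                 # pairs that differ on X
    if den == 0:
        raise ValueError("X must take at least two distinct values")
    return num / den


# Hypothetical propensity scores (w) and binary treatment indicator (x).
w = [0.10, 0.40, 0.35, 0.80]
x = [0, 0, 1, 1]
print(somers_d(w, x))  # imbalance of W between the two treatment groups
```

A value near zero would indicate that the confounder or propensity score is well balanced between treatment groups; matching, weighting, and within-stratum comparisons, which somersd supports, are deliberately left out of this sketch.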

9:45–10:15 
Abstract:
Multistate models are increasingly being used to model
complex disease profiles. By modeling transitions
between disease states, accounting for competing events
at each transition, we can gain a much richer
understanding of patient trajectories and how risk
factors impact over the entire disease pathway. In this
talk, we will introduce some new Stata commands for the
analysis of multistate survival data. This includes
msset, a data preparation tool that converts a
dataset from wide (one observation per subject, multiple
time and status variables) to long (one observation for
each transition for which a subject is at risk). We
develop a new estimation command, stms, that
allows the user to fit different parametric distributions
for different transitions, simultaneously, while
allowing for sharing of covariate effects across
transitions. Finally, predictms calculates
transition probabilities, and many other useful measures
of absolute risk, following the fit of any model using
streg, stms, or stcox, using either a
simulation approach or the Aalen–Johansen estimator. We
illustrate the software using a dataset of patients with
primary breast cancer.
Michael J. Crowther
University of Leicester and Karolinska Institutet
Paul C. Lambert
University of Leicester and Karolinska Institutet
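
The wide-to-long conversion described in this abstract can be sketched for a simple illness-death model. This is an illustrative Python sketch only; the state numbering, the field names (`ill_time`, `ill_status`, `os_time`, `os_status`), and the output layout are assumptions made here and are not the msset syntax or output format.

```python
def wide_to_long(subject):
    """Expand one wide record into one row per transition at risk.

    Assumed illness-death model: 1 = healthy, 2 = ill, 3 = dead.
    Transitions: 1 (1->2), 2 (1->3), 3 (2->3).
    """
    sid = subject["id"]
    ill = subject["ill_status"] == 1
    # Time of leaving state 1: illness onset if it occurred,
    # otherwise death or censoring.
    exit1 = subject["ill_time"] if ill else subject["os_time"]
    died_healthy = int(not ill and subject["os_status"] == 1)
    rows = [
        {"id": sid, "trans": 1, "from": 1, "to": 2,
         "start": 0.0, "stop": exit1, "status": int(ill)},
        {"id": sid, "trans": 2, "from": 1, "to": 3,
         "start": 0.0, "stop": exit1, "status": died_healthy},
    ]
    if ill:
        # Only subjects who became ill are ever at risk of the 2->3 transition.
        rows.append({"id": sid, "trans": 3, "from": 2, "to": 3,
                     "start": subject["ill_time"], "stop": subject["os_time"],
                     "status": subject["os_status"]})
    return rows


# Hypothetical subjects: one becomes ill and later dies, one is censored.
ill_then_dead = {"id": 1, "ill_time": 2.0, "ill_status": 1,
                 "os_time": 5.0, "os_status": 1}
censored = {"id": 2, "ill_time": 4.0, "ill_status": 0,
            "os_time": 4.0, "os_status": 0}
for row in wide_to_long(ill_then_dead) + wide_to_long(censored):
    print(row)
```

Each long-format row records the interval during which the subject was at risk of one transition, which is the shape a transition-specific survival model needs.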

10:15–10:45 
Abstract:
Quantile plots show ordered values (raw data, estimates,
residuals, whatever) against rank or cumulative
probability or a one-to-one function of the same. Even
in a strict sense, they are almost 200 years old. In
Stata, quantile, qqplot, and qnorm go
back to 1985 and 1986. So why any fuss?
The presentation is built on a long-considered view that quantile plots are the best single plot for univariate distributions. No other kind of plot shows so many features so well across a range of sample sizes with so few arbitrary decisions. Both official and user-written programs appear in a review that includes side-by-side and superimposed comparisons of quantiles for different groups and comparable variables. Emphasis is on newer, previously unpublished work, with focus on the compatibility of quantiles with transformations; fitting and testing of brand-name distributions; quantile-box plots as proposed by Emanuel Parzen (1929–2016); equivalents for ordinal categorical data; and the question of which graphics best support paired and two-sample t and other tests. Commands mentioned include distplot, multqplot, and qplot (Stata Journal) and mylabels, stripplot, and hdquantile (SSC).
References:
Cox, N. J. 1999a. Distribution function plots. Stata Technical Bulletin 51: 12–16. Updates Stata Journal 3-2, 3-4, 5-3, 10-1.
Cox, N. J. 1999b. Quantile plots, generalized. Stata Technical Bulletin 51: 16–18. Updates Stata Technical Bulletin 61; Stata Journal 4-1, 5-3, 6-4, 10-4, 12-1.
Cox, N. J. 2005. The protean quantile plot. Stata Journal 5: 442–460.
Cox, N. J. 2007. Quantile–quantile plots without programming. Stata Journal 7: 275–279.
Cox, N. J. 2012. Axis practice, or what goes where on a graph. Stata Journal 12: 549–561.
Nicholas J. Cox
Durham University
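
The basic construction behind a quantile plot, ordered values against cumulative probability, can be sketched in a few lines of Python. The plotting position (i − 0.5)/n used here is one common convention and is an assumption of this sketch; other positions are equally defensible.

```python
def quantile_plot_points(values):
    """Pair each ordered value with the plotting position (i - 0.5) / n."""
    ordered = sorted(values)
    n = len(ordered)
    return [((i - 0.5) / n, v) for i, v in enumerate(ordered, start=1)]


# Made-up data: each tuple is (cumulative probability, ordered value).
points = quantile_plot_points([3.0, 1.0, 2.0])
for prob, value in points:
    print(round(prob, 3), value)
```

Plotting the second coordinate against the first gives the quantile plot; no binning or bandwidth choice is needed, which is the "few arbitrary decisions" point made in the abstract.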

11:15–11:45 
Abstract:
At the 2009 meeting in Bonn, I presented a new Stata
command called texdoc. The command allowed weaving
Stata code into a LaTeX document, but its functionality
and its usefulness for larger projects were limited. In
the meantime, I heavily revised the texdoc command
to simplify the workflow and improve support for complex
documents. The command is now well suited, for example,
to generate automatic documentation of data analyses or
even to write an entire book. In this talk, I will
present the new features of texdoc and provide
examples of their application.
Ben Jann
University of Bern

11:45–12:15 
Abstract:
In many fields of statistics, summary tables are used to
describe characteristics within a study population.
Moreover, such tables are often used to compare
characteristics of two or more groups, for example,
treatment groups in a clinical trial or different
cohorts in an observational study. This talk introduces
the sumtable command, a user-written command that
can be used to produce such summary tables, allowing for
different summary measures within one table. Summary
measures available include means and standard
deviations, medians and interquartile ranges, and numbers
and percentages. The command removes any manual
aspect of creating these tables (for example, copying and
pasting from the Stata output window) and therefore
eliminates transposition errors. It also makes creating
a summary table quick and easy and is especially useful
if data are updated and tables subsequently need to
change. The end result is an Excel spreadsheet that can
be easily manipulated for reports or other documents.
Although this command was written in the context of
medical statistics, it would be equally useful in many
other settings.
Lauren J. Scott
Clinical Trials and Evaluation Unit, Bristol
Chris A. Rogers
Clinical Trials and Evaluation Unit, Bristol

12:15–12:45 
Abstract:
One of the main reasons for the popularity of panel data
is that they make it possible to account for the
presence of time-invariant unobserved individual
characteristics, the so-called fixed effects. Consistent
estimation of the fixed effects is only possible if the
number of time periods is allowed to pass to infinity, a
condition that is often unreasonable in practice.
However, in a small number of cases, it is possible to
find methods that allow consistent estimation of the
remaining parameters of the model, even when the number
of time periods is fixed. These methods are based on
transformations of the problem that effectively
eliminate the fixed effects from the model.
A drawback of these estimators is that they do not provide consistent estimates of the fixed effects, and this limits the kind of inference that can be performed. For example, in linear models, it is not possible to use the estimates obtained in this way to make predictions of the variate of interest. This problem is particularly acute in nonlinear models, where often the parameters have little meaning, and it is more interesting to evaluate partial effects on quantities of interest. In this presentation, we show that although it is indeed generally impossible to evaluate the partial effects at points of interest, it is sometimes possible to consistently estimate quantities that are informative and easy to interpret. The problem will be discussed using Stata, centered on a new ado-file for calculating the average logit elasticities.
Gordon Kemp
University of Essex
João M.C. Santos Silva
University of Surrey

1:45–2:45 
Abstract:
Doctors and consultants want to know the effect of a
covariate for a given covariate pattern. Policy analysts
want to know a population-level effect of a covariate. I
discuss how to estimate and interpret these effects
using factor variables and margins.
David M. Drukker
StataCorp

2:45–3:15 
Abstract:
We model the time series of credit default swap (CDS)
spreads on sovereign debt in the Eurozone, allowing for
stochastic volatility and examining the effects of
country-specific and systemic shocks. A weekly
volatility series is produced from daily quotations on
11 Eurozone countries' CDS for 2009–2010. Using
Stata's gmm command, we construct a highly
nonlinear model of the evolution of realized volatility
when subjected to both idiosyncratic and systemic
shocks. Evaluation of the quality of the fit for the 24
moment conditions is produced by a Mata auxiliary
routine. This model captures many of the features of
these financial markets during a turbulent period in the
recent history of the single currency. We find that
systemic volatility shocks increase returns on
"virtuous" borrowers' CDS while reducing returns for
the most troubled countries' obligations.
Christopher F. Baum
Boston College and DIW Berlin
Paola Zerilli
University of York

3:15–3:30 
Abstract:
This presentation introduces a new Stata command,
xtdcce, to estimate a dynamic common correlated effects
model with heterogeneous coefficients. The estimation
procedure mainly follows Chudik and Pesaran (2015); in
addition, the common correlated effects estimator
(Pesaran 2006) as well as the mean group (Pesaran and
Smith 1995) and the pooled mean group estimator (Shin,
Pesaran, and Smith 1999) are supported. Coefficients are
allowed to be heterogeneous or homogeneous. In addition,
instrumental variable regressions and unbalanced panels
are supported. The cross-sectional dependence test (CD
test) is automatically calculated and presented in the
estimation output. Examples for empirical applications
of all estimation methods mentioned above are given.
References:
Chudik, A., and M. H. Pesaran. 2015. Large panel data models with cross-sectional dependence: A survey. In The Oxford Handbook of Panel Data, ed. B. H. Baltagi, 3–45. New York: Oxford University Press.
Pesaran, M. H. 2006. Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica 74: 967–1012.
Pesaran, M. H., and R. Smith. 1995. Estimating long-run relationships from dynamic heterogeneous panels. Journal of Econometrics 68: 79–113.
Shin, Y., M. H. Pesaran, and R. P. Smith. 1999. Pooled mean group estimation of dynamic heterogeneous panels. Journal of the American Statistical Association 94: 621–634.
Jan Ditzen
Spatial Economics and Econometrics Centre, Heriot-Watt University, Edinburgh

4:00–4:30 
Abstract:
Linear mixed-effects models are commonly used for the
analysis of longitudinal biomarkers of disease. Taylor,
Cumberland, and Sy (1994) proposed modeling biomarkers with a
linear mixed-effects model with an added integrated
Ornstein–Uhlenbeck (IOU) process (linear mixed-effects
IOU model). This allows for autocorrelation, changing
within-subject variance, and the incorporation of
derivative tracking, that is, how much a subject tends
to maintain the same trajectory for extended periods of
time. Taylor, Cumberland, and Sy argued that the covariance
structure induced by the stochastic process in this
model was interpretable and more biologically plausible
than the standard linear mixed-effects model. However,
their model is rarely used, partly because of the lack of
available software. We present a new Stata command,
xtiou, that fits the linear mixed-effects IOU model
and its special case, the linear mixed-effects Brownian
motion model. The model can be fit to balanced and
unbalanced data, using restricted maximum-likelihood
estimation, where the optimization algorithm is either
the Newton–Raphson, Fisher scoring, or average
information algorithm, or any combination of these. To
aid convergence, the command allows the user to change
the method for deriving the starting values for
optimization, the optimization algorithm, and the
parameterization of the IOU process. We also provide a
predict command to generate predictions under the
model. We illustrate xtiou and predict with
an example of repeated biomarker measurements from
HIV-positive patients.
Reference: Taylor, J., W. Cumberland, and J. Sy. 1994. A stochastic model for analysis of longitudinal AIDS data. Journal of the American Statistical Association 89: 727–736.
Rachael A. Hughes
University of Bristol
Michael G. Kenward
Luton
Jonathan A.C. Sterne
University of Bristol
Kate Tilling
University of Bristol

4:30–5:00 
Abstract:
Stata and Mata are very powerful and flexible for data
processing and analysis, but there are some problems
that can be fixed faster or more easily by using a
lower-level programming language. statacpp is a
command that allows users to write a C++ program,
have Stata add their data, matrices, or globals into it,
compile it to an executable program, run it, and return
the results back into Stata as more variables, matrices,
or globals in a do-file. The most important use cases
are likely to be around big data and MapReduce (where
data can be filtered and processed according to
parameters from Stata and reduced results passed into
Stata) and machine learning (where existing powerful
libraries such as TensorFlow can be utilised). Short
examples will be shown of both these aspects. Future
directions for development will also be outlined, in
particular calling Stata from C++ (useful for real-time
responsive analysis) and calling CUDA from Stata (useful
for massively parallel processing on GPU chips).
Work in progress at https://github.com/robertgrant/statacpp
Robert L. Grant
Kingston and St George's, London

5:00–5:30 
Abstract:
Attrition is one potential bias that occurs in
longitudinal studies when participants drop out and is
informative when the reason for attrition is associated
with the study outcome. However, this is impossible to
check because the data we need to confirm informative
attrition are missing. When data are missing at random
(MAR), that is, when the probability of missingness is not
associated with the missing values conditional on the
observed data, one appropriate approach for handling
missing data is multiple imputation (MI). However, when
attrition results in the data being missing not at
random (MNAR), the probability of missing data is
associated with the values missing, so we cannot use MI
directly. An alternative approach is pattern mixture
modeling, which specifies the distribution of the
observed data, which we know, and the missing data,
which we don't know. We can estimate the missing data
models, using observations about the data, and average
the estimates of the two models using MI. Many
longitudinal clinical trials have a monotone missing
pattern (once participants drop out, they do not return),
which simplifies MI, so pattern mixture modeling can be used as
a sensitivity analysis. However, in observational
studies, data are missing because of nonresponses and
attrition, which is a more complex setting for handling
attrition compared with clinical trials.
For this study, we used data from the Whitehall II study. Data were first collected on over 10,000 civil servants in 1985, and data collection phases are repeated every 2–3 years. Participants complete a health and lifestyle questionnaire and, at alternate, odd-numbered phases, attend a screening clinic. Over 30 years, many epidemiological studies have used these data. One study investigated how smoking status at baseline (Phase 5) was associated with a 10-year cognitive decline using a mixed model with random intercept and slope. In these analyses, the authors replaced missing values in nonresponders with last observed values. However, participants with reduced cognitive function may be unable to continue participation in the Whitehall II study, which may bias the statistical analysis. Using Stata, we will simulate 1,000 datasets with the same distributions and associations as Whitehall II to perform the statistical analysis described above. First, we will develop an MAR missingness mechanism (conditional on previously observed values) and change cognitive function values to missing. Next, for attrition, we will use an MNAR missingness mechanism (conditional on measurements at the same phase). For both MAR and MNAR missingness mechanisms, we will compare the bias and precision from an analysis of simulated datasets without any missing data with a complete case analysis and an analysis of data imputed using MI; additionally, for the MNAR missingness mechanism, we will use pattern mixture modeling. We will use the two-fold fully conditional specification (FCS) algorithm to impute missing values for nonresponders and to average estimates when using pattern mixture modeling. The two-fold FCS algorithm imputes each phase sequentially conditional on observed information at adjacent phases, so it is a suitable approach for imputing missing values in longitudinal data. The user-written package for this approach, twofold, is available on the Statistical Software Components (SSC) archive.
We will present the methods used to perform the study and results from these comparisons.
Catherine Welch
Research Department of Epidemiology and Public Health, UCL
Martin Shipley
Research Department of Epidemiology and Public Health, UCL
Séverine Sabia
INSERM U1018, Centre for Research in Epidemiology and Population Health, Villejuif, France
Eric Brunner
Research Department of Epidemiology and Public Health, UCL
Mika Kivim
Research Department of Epidemiology and Public Health, UCL
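
The distinction between the two missingness mechanisms described in this abstract can be made concrete with a toy simulation. This Python sketch is not the authors' simulation and uses no Whitehall II data: the two-phase normal outcome, the missingness probabilities, and the seed are all invented for illustration.

```python
import random


def simulate(n=5000, seed=2016):
    """Simulate two phases of an outcome with MAR and MNAR missingness flags."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)             # earlier phase, always observed
        y2 = 0.8 * y1 + rng.gauss(0.0, 0.6)  # later phase, may go missing
        # MAR: missingness depends only on the previously observed value.
        miss_mar = rng.random() < (0.8 if y1 < 0 else 0.1)
        # MNAR: missingness depends on the value measured at the same phase.
        miss_mnar = rng.random() < (0.8 if y2 < 0 else 0.1)
        rows.append((y1, y2, miss_mar, miss_mnar))
    return rows


def mean(values):
    return sum(values) / len(values)


rows = simulate()
# Condition on the observed history (here: y1 < 0) and compare complete cases
# with the full data inside that stratum.
stratum = [r for r in rows if r[0] < 0]
full = mean([r[1] for r in stratum])
cc_mar = mean([r[1] for r in stratum if not r[2]])
cc_mnar = mean([r[1] for r in stratum if not r[3]])
print("full data:", round(full, 3))
print("complete cases, MAR:", round(cc_mar, 3))
print("complete cases, MNAR:", round(cc_mnar, 3))
```

Within a stratum of the observed history, complete cases remain representative under MAR but not under MNAR, which is why methods that condition on observed data (such as MI) can work in the first setting and need a sensitivity analysis in the second.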

Friday, September 9
9:30–10:00 
Abstract:
In this presentation, I discuss the new Stata command
xtdpdqml, which implements the unconditional
quasi-maximum likelihood estimators of Bhargava and
Sargan (1983, Econometrica 51: 1635–1659) for
linear dynamic panel models with random effects and
of Hsiao, Pesaran, and Tahmiscioglu (2002, Journal of
Econometrics 109: 107–150) for linear dynamic panel
models with fixed effects when the number of cross-sections
is large and the time dimension is fixed.
The marginal distribution of the initial observations is modeled as a function of the observed variables to circumvent a short-T dynamic panel-data bias. Robust standard errors are available following the arguments of Hayakawa and Pesaran (2015, Journal of Econometrics 188: 111–134). xtdpdqml also supports standard postestimation commands, including suest, which can be used for a generalized Hausman test to discriminate between the dynamic random-effects and the dynamic fixed-effects model.
Sebastian Kripfganz
University of Exeter Business School

10:00–10:30 
Abstract:
Incorporating covariates in (income or wage)
distribution analysis typically involves estimating
conditional distribution models, that is, models for the
cumulative distribution of the outcome of interest
conditionally on the value of a set of covariates. A
simple strategy is to estimate a series of binary
outcome regression models for \(F(z \mid x_i) = {\rm Pr}(y_i
\le z \mid x_i)\) for a grid of values for \(z\) (Peracchi and
Foresi, 1995, Journal of the American Statistical Association;
Chernozhukov et al., 2013, Econometrica).
This approach, now often referred to as
"distribution regression", is attractive and easy to
implement. This talk illustrates how the Stata commands
margins and suest can be useful for inference
here and suggests various tips and tricks to speed up
the process and solve potential computational issues.
It also shows how to use conditional distribution model
estimates to analyze various aspects of unconditional
distributions.
Philippe Van Kerm
Luxembourg Institute of Socio-Economic Research
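
The grid-of-binary-models idea in this abstract, and the step from conditional to unconditional distributions, can be sketched in Python under a strong simplification. With a single binary covariate, a saturated binary model's fitted probabilities are just within-group proportions, so this sketch uses those proportions in place of a fitted logit or probit. The data are invented, and the function names are hypothetical.

```python
def conditional_cdf(y, x, z, group):
    """Estimate F(z | x = group) as the share of outcomes <= z in the group."""
    hits = [yi <= z for yi, xi in zip(y, x) if xi == group]
    return sum(hits) / len(hits)


def unconditional_cdf(y, x, z, x_values):
    """Average the conditional estimates over a covariate distribution."""
    return sum(conditional_cdf(y, x, z, g) for g in x_values) / len(x_values)


# Made-up wages (y) and a binary covariate (x).
y = [10, 12, 15, 20, 18, 25, 30, 40]
x = [0, 0, 0, 0, 1, 1, 1, 1]

for z in [12, 18, 25]:  # grid of thresholds
    actual = unconditional_cdf(y, x, z, x)
    # Counterfactual: everyone is assigned x = 1.
    counterfactual = unconditional_cdf(y, x, z, [1] * len(x))
    print(z, actual, counterfactual)
```

Averaging over the sample's own covariate values recovers the unconditional distribution, while averaging over an altered covariate distribution gives a counterfactual one; with continuous covariates each threshold would instead need a fitted binary regression.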

10:30–10:45 
Abstract:
SDMX, which stands for Statistical Data and Metadata
eXchange, is a standard developed by seven international
organizations (BIS, ECB, Eurostat, IMF, OECD, the United
Nations, and the World Bank) to facilitate the exchange
of statistical data (https://sdmx.org/).
The package sdmxuse aims at helping Stata users to
download SDMX data directly within their favorite
software. The program builds and sends a query to the
statistical agency (using RESTful web services), then
imports and formats the downloaded dataset (in XML
format). Some initiatives, notably the SDMX connector by
Attilio Mattiocco at the Bank of Italy
(https://github.com/amattioc/SDMX),
have already been
implemented to facilitate the use of SDMX data for
external users, but they all rely on the Java programming
language. Formatting the data directly within Stata has
proved to be quicker for large datasets, but it also
offers a simpler way for users to address potential
bugs. The last argument is of particular importance for
a standard that is evolving relatively fast.
The presentation will include an explanation of the functioning of the sdmxuse program as well as an illustration of its usefulness in the context of macroeconomic forecasting. Since the seminal work of Stock and Watson (2002), factor models have become widely used to compute early estimates (nowcasting) of macroeconomic series (for example, Gross Domestic Product). More recent works (for example, Angelini et al. 2011) have shown that regressions on factors extracted from a large panel of time series outperform traditional bridge equations. But this trend has increased the need for datasets with many time series (often more than 100) that are updated immediately after new releases are made available (that is, almost daily). The package sdmxuse should be of interest for users wanting to work on the development of such models.
References:
Angelini, E., G. Camba-Mendez, D. Giannone, L. Reichlin, and G. Rünstler. 2011. Short-term forecasts of euro area GDP growth. Econometrics Journal 14: 25–44.
Stock, J. H., and M. W. Watson. 2002. Forecasting using principal components from a large number of predictors. Journal of the American Statistical Association 97: 1167–1179.
Sébastien Fontenay
Institut de Recherches Économiques et Sociales, Université catholique de Louvain 
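
The query-building step mentioned in this abstract can be sketched as follows. This Python fragment is not taken from sdmxuse; it assumes the data-query pattern of the SDMX RESTful web services (`/data/{flow}/{key}` with `startPeriod` and `endPeriod` parameters), and the endpoint, dataflow, and series key shown are placeholders.

```python
from urllib.parse import urlencode


def build_sdmx_data_url(base_url, flow, key, start_period=None, end_period=None):
    """Build an SDMX REST data query URL (assumed pattern: /data/{flow}/{key})."""
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    url = f"{base_url.rstrip('/')}/data/{flow}/{key}"
    return f"{url}?{urlencode(params)}" if params else url


# Placeholder endpoint, dataflow, and series key for illustration only.
url = build_sdmx_data_url("https://example.org/sdmx", "EXR",
                          "M.USD.EUR.SP00.A", start_period="2015-01")
print(url)
```

The response to such a query would be an XML document that still has to be parsed and reshaped, which is the formatting step the abstract describes doing inside Stata.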
11:15–12:15 
Abstract:
Joint modeling of longitudinal and survival-time data
has been gaining more and more attention in recent
years. Many studies collect both longitudinal and
survival-time data. Longitudinal, panel, or
repeated-measures data record data measured repeatedly
at different time points. Survival-time or event history
data record times to an event of interest such as death
or onset of a disease. The longitudinal and
survival-time outcomes are often related and should thus
be analyzed jointly. Three types of joint analysis may
be considered: 1) evaluation of the effects of
time-dependent covariates on the survival time; 2)
adjustment for informative dropout in the analysis of
longitudinal data; and 3) joint assessment of the
effects of baseline covariates on the two types of
outcomes. In this presentation, I will provide a brief
introduction to the methodology and demonstrate how to
perform these three types of joint analysis in Stata.
Yulia Marchenko
StataCorp

12:15–12:45 
Abstract:
Modeling within competing risks is increasing in
prominence as researchers are becoming more interested
in real-world probabilities of a patient's risk of dying
from a disease while also being at risk of dying from
other causes. Interest lies in the cause-specific
cumulative incidence function (CIF), which can be
calculated by (1) transforming the cause-specific
hazards (CSH) or (2) through its direct relationship
with the subdistribution hazards (SDH).
We expand on current competing risks methodology within the flexible parametric survival modeling framework and focus on approach (2), which is more useful when we look to questions on prognosis. These models can be parameterized through direct likelihood inference on the cause-specific CIF (Jeong and Fine 2006), which offers a number of advantages over the more popular Fine and Gray (1999) modeling approach. Models have also been adapted for cure models using a similar approach described by Andersson et al. (2011) for flexible parametric relative survival models. An estimation command, stpm2cr, has been written in Stata that is used to model all cause-specific CIFs simultaneously. Using SEER data, we compare and contrast our approach with standard methods and show that many useful out-of-sample predictions can be made after fitting a flexible parametric SDH model, for example, CIF ratios and CSH. Alternative link functions may also be incorporated, such as the logit link, leading to proportional odds models, and models can be easily extended for time-dependent effects. We also show that an advantage of our approach is that it is less computationally intensive, which is important, particularly when analyzing larger datasets.
References:
Andersson, T. M.-L., P. W. Dickman, S. Eloranta, and P. C. Lambert. 2011. Estimating and modelling cure in population-based cancer studies within the framework of flexible parametric survival models. BMC Medical Research Methodology 11(1): 96. doi: 10.1186/1471-2288-11-96.
Fine, J. P., and R. J. Gray. 1999. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association 94: 496–509.
Jeong, J.-H., and J. P. Fine. 2006. Direct parametric inference for the cumulative incidence function. Applied Statistics 55: 187–200.
Sarwar Islam
University of Leicester
Paul C. Lambert
University of Leicester and Karolinska Institutet, Stockholm
Mark J. Rutherford
University of Leicester
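
Approach (1) in this abstract, obtaining the cause-specific CIF from the cause-specific hazards, can be illustrated numerically. The Python sketch below assumes constant hazards purely for simplicity (not the flexible parametric models of the talk), so that the integral CIF_k(t) = ∫ h_k(u) S(u) du over [0, t], with S(u) = exp(−u Σ h_j), can be checked against its closed form.

```python
import math


def cif_numeric(k, hazards, t, steps=10_000):
    """CIF for cause k by midpoint integration of h_k(u) * S(u) on [0, t]."""
    total = sum(hazards)
    width = t / steps
    area = 0.0
    for i in range(steps):
        u = (i + 0.5) * width
        overall_survival = math.exp(-total * u)  # S(u) under constant hazards
        area += hazards[k] * overall_survival * width
    return area


def cif_closed_form(k, hazards, t):
    """Closed-form CIF for cause k when all cause-specific hazards are constant."""
    total = sum(hazards)
    return hazards[k] / total * (1.0 - math.exp(-total * t))


# Hypothetical constant hazards for death from cancer and from other causes.
hazards = [0.10, 0.05]
for cause in range(len(hazards)):
    print(cause, cif_numeric(cause, hazards, 5.0),
          cif_closed_form(cause, hazards, 5.0))
```

The CIF for one cause depends on the hazards of every cause through the overall survival term, which is why a covariate's effect on a cause-specific hazard does not translate directly into its effect on the CIF.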

1:45–3:00 
Abstract:
Simulation studies are an invaluable tool for
statistical research, particularly for the evaluation of
a new method or comparison of competing methods.
Simulations are well used by methodologists but often
conducted or reported poorly, and are underused by
applied statisticians. It's easy to execute a simulation
study in Stata, but it's at least as easy to do it
wrong.
We will describe a systematic approach to getting it right.
Tim Morris
MRC Clinical Trials Unit at UCL
Ian White
MRC Biostatistics Unit, Cambridge
Michael Crowther
University of Leicester
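
A bare-bones simulation study of the kind this abstract has in mind can be sketched in Python. The data-generating mechanism, the estimator (a sample mean), the number of repetitions, and the seed are all choices made for this illustration and are not taken from the talk; the point is only that performance measures should be reported together with their Monte Carlo error.

```python
import random
import statistics


def run_simulation(n_reps=2000, n_obs=30, true_mean=1.0, sd=2.0, seed=20160909):
    """Estimate bias and empirical SE of the sample mean, with Monte Carlo SE."""
    rng = random.Random(seed)  # fixed seed so the study is reproducible
    estimates = []
    for _ in range(n_reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n_obs)]
        estimates.append(statistics.fmean(sample))
    bias = statistics.fmean(estimates) - true_mean
    empirical_se = statistics.stdev(estimates)
    return {
        "bias": bias,
        "empirical_se": empirical_se,
        # Monte Carlo standard error of the bias estimate.
        "mcse_bias": empirical_se / n_reps ** 0.5,
    }


results = run_simulation()
for name, value in results.items():
    print(name, round(value, 4))
```

Setting the seed once, storing every repetition's estimate, and reporting the Monte Carlo standard error alongside the bias are small habits that make a simulation study checkable by others.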

3:00–3:30 
Abstract:
The statistical analysis of longitudinal randomized
clinical trials is frequently complicated by the
occurrence of protocol deviations that result in
incomplete datasets for analysis. However one
approaches analysis, an untestable assumption about the
distribution of the unobserved post-deviation data must
be made. In such circumstances, it is important to assess
the robustness of trial results from primary analysis to
different credible assumptions about the distribution of
the unobserved data.
Reference-based multiple-imputation procedures allow trialists to assess the impact of contextually relevant qualitative missing data assumptions (Carpenter, Roger, and Kenward 2013). For example, in a trial of an active versus placebo treatment, missing data for active patients can be imputed following the distribution of the data in the placebo arm. I present the mimix command, which implements the reference-based multiple-imputation procedures in Stata, enabling relevant accessible sensitivity analysis of trial datasets.
Reference: Carpenter, J. R., J. H. Roger, and M. G. Kenward. 2013. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation. Journal of Biopharmaceutical Statistics 23(6): 1352–71.
Suzie Cro
MRC Clinical Trials Unit at UCL and London School of Hygiene and Tropical Medicine

4:00–4:30 
Abstract:
Parallel computing has promised to deliver faster
computing for everyone using off-the-shelf multicore
computers. Despite proprietary implementation of new
routines in Stata/MP, the time required to conduct
computationally intensive tasks such as bootstrapping,
simulation, and multiple imputation hasn't dramatically
improved.
One strategy to speed up computationally intensive tasks is to use distributed high-performance computing (HPC) clusters. Using HPC clusters to speed up computationally intensive tasks typically involves a divide-and-conquer approach. This simply divides repetitive tasks, distributes them across multiple processors to run independently, and combines the results at the end of the process. The ability to access such clusters is limited; however, a similar system can be implemented on your desktop PC using the user-written command qsub. qsub provides a wrapper that writes, submits, and monitors jobs submitted to your desktop PC and that may dramatically improve the speed at which frequent computationally intensive tasks are achieved.
Adrian Sayers
Musculoskeletal Research Unit, University of Bristol
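
The divide-and-conquer approach described in this abstract can be sketched for a bootstrap. This Python sketch does not use or imitate qsub: the data are made up, and a thread pool stands in for the separate jobs that a cluster scheduler or process pool would run, so it shows the split, run, and combine logic rather than a real speed-up.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor

# Made-up sample whose mean is bootstrapped.
DATA = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]


def bootstrap_chunk(job):
    """Run one chunk of bootstrap replications with its own seeded generator."""
    seed, n_reps = job
    rng = random.Random(seed)
    means = []
    for _ in range(n_reps):
        resample = [rng.choice(DATA) for _ in DATA]
        means.append(statistics.fmean(resample))
    return means


def parallel_bootstrap(total_reps=1000, n_workers=4, base_seed=42):
    """Divide replications into chunks, run them, and combine the results."""
    per_worker, extra = divmod(total_reps, n_workers)
    sizes = [per_worker + (1 if i < extra else 0) for i in range(n_workers)]
    jobs = [(base_seed + i, size) for i, size in enumerate(sizes)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        chunks = list(pool.map(bootstrap_chunk, jobs))  # keeps job order
    return [m for chunk in chunks for m in chunk]


means = parallel_bootstrap()
print("replications:", len(means))
print("bootstrap SE of the mean:", round(statistics.stdev(means), 4))
```

Giving each chunk its own seed keeps the combined result reproducible regardless of which worker finishes first, a detail that matters as much on a desktop as on a cluster.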

4:30–close 
StataCorp

Organizers
Scientific committee
Nicholas J. Cox
Durham University
Patrick Royston
MRC Clinical Trials Unit at UCL
Tim Morris
MRC Clinical Trials Unit at UCL
Logistics organizer
The logistics organizer for the 2016 London Stata Users Group meeting is Timberlake Consultants, the distributor of Stata in the UK and Ireland.
For more information about the 2016 Stata Users Group meeting, visit the official website.
View the proceedings of previous Stata Users Group meetings.