Home  /  Stata Conferences  /  2021 United Kingdom

Proceedings

11:00–11:20 Ridits right, left, center, native and foreign Abstract: Ridit functions are specified with respect to an identified probability distribution. They are like ranks, only expressed on a scale from 0 to 1 (for unfolded ridits), or -1 to 1 (for folded ridits). Ridit functions have generalized inverses called percentile functions. A native ridit is a ridit of a variable with respect to its own distribution.
Native ridits can be computed using the ridit() function of Nick Cox's SSC package egenmore. Alternatively, weighted ridits can be computed using the SSC package wridit. This has a handedness() option, where handedness(right) specifies a right-continuous ridit (also known as a cumulative distribution function), handedness(left) specifies a left-continuous ridit, and handedness(center) (the default) specifies a ridit function discontinuous at its mass points. wridit now has a module fridit, computing foreign ridits of a variable with respect to a distribution other than its own, specifying the foreign distribution in another data frame. An application of ridits is ridit splines, which are splines in a ridit function, typically computed using the SSC package polyspline. As an example, we may fit a ridit spline to a training set and use it for prediction in a test set, using foreign ridits of an X variable in the test set with respect to the distribution of the X variable in the training set. The model parameters are typically values of an outcome variable corresponding with percentiles of the X variable in the training set. This practice stabilizes (or Winsorizes) outcome values corresponding with X values in the test set outside the range of X values in the training set.
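For readers who want to try these tools, a minimal sketch (the handedness() values follow the abstract; the wridit generate() option name is an assumption, so check each package's help file):

```stata
* Hedged sketch: native and weighted ridits with community packages.
ssc install egenmore
ssc install wridit
sysuse auto, clear
egen r1 = ridit(mpg)                       // native ridit of mpg
wridit mpg, generate(r2) handedness(left)  // left-continuous weighted ridit
```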


Additional information:
UK21_Newson.zip

Roger Newson
King's College London
11:20–11:40 The production process of the global MPI Abstract: The Global Multidimensional Poverty Index (MPI) is a cross-country poverty measure published by the Oxford Poverty and Human Development Initiative since 2010. The estimation requires household survey data because multidimensional poverty measures seek to exploit the joint distribution of deprivations in the identification step of poverty measurement.
Analyses of multidimensional poverty draw on several aggregate measures (for example, the headcount ratio), dimensional quantities (for example, indicator contributions), and auxiliary statistics (for example, nonresponse rates). Robustness analyses of key parameters (for example, poverty cutoffs) and several levels of analysis (for example, subnational regions) further increase the number of estimates.

In 2018, the underlying workflow was revised and has since been under continuous development, which for the first time allowed figures to be calculated for 105 countries in a single round. In 2021, this workflow was substantially expanded to include the estimation of changes over time. The 2021 regular global MPI release includes 109 countries (with 1291 subnational regions), whereas changes over time are provided for 84 countries with 793 subnational regions over up to three years. In total, this release builds on 220 microdatasets.

For a large-scale project like this, a clear and efficient workflow is essential. This presentation introduces key elements of the workflow and presents Stata solutions to particular problems: the structure of a comprehensive results file that facilitates both analysis and production of deliverables, the usability of the estimation files, the collaborative nature of the project, the production of country briefings, and how some of the additional challenges introduced by the incorporation of changes over time have been addressed so far. This presentation seeks to share the experience gained and to subject both the principal workflow and the selected solutions to public scrutiny.


Additional information:
UK21_Suppa.zip

Nicolai Suppa
Autonomous University of Barcelona
11:40–12:00 A unified Stata package for calculating sample sizes for trials with binary outcomes (artbin) Abstract: Sample-size calculation is essential in the design of a randomized clinical trial in order to ensure that there is adequate power to evaluate treatment. It is also used in the design of randomized experiments in other fields such as education, international development, and social science. We describe the command artbin, which calculates sample size or power for a clinical trial or similar experiment with a binary outcome.
A particular feature of artbin is that it can be used to design noninferiority (NI) and substantial-superiority (SS) trials. Noninferiority trials are used in the development of new treatment regimes to test whether the experimental treatment is no worse than an existing treatment by more than a prespecified amount. NI trials are used when the intervention is not expected to be superior but has other benefits, such as offering a shorter, less complex regime that can reduce the risk of drug-resistant strains developing, of particular concern for countries without robust healthcare systems. We illustrate the command’s use in the STREAM trial, an NI design that demonstrated that a shorter, more intensive treatment for multidrug-resistant tuberculosis was only 1% less effective than the lengthier treatment recommended by the World Health Organization. artbin also differs from the official power command by allowing a wide range of statistical tests (score, Wald, conditional, trend across K groups) and offering calculations under local or distant alternatives with or without continuity correction. artbin has been available since 2004, but recent updates include clearer syntax, clear documentation, and some new features.
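For orientation, the official power command mentioned above handles the standard superiority two-proportions case; artbin's own NI/SS syntax is documented in its help file and is not reproduced here:

```stata
* Sample size for comparing proportions 0.20 vs. 0.10 with 90% power,
* using the official power command (the baseline that artbin extends).
power twoproportions 0.20 0.10, power(0.9) alpha(0.05)
```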
Contributors:
Ian R. White
Mahesh K. B. Parmar
Patrick Royston
Abdel G. Babiker
University College London

Additional information:
UK21_Marley-Zagar.pptx

Ella Marley-Zagar
University College London
12:00–12:15 Instrumental-variable estimation of large-T panel-data models with common factors Abstract: We introduce the xtivdfreg command in Stata, which implements a general instrumental-variables (IV) approach for fitting panel-data models with a large number of time-series observations, T, and unobserved common factors or interactive effects, as developed by Norkute, Sarafidis, Yamagata, and Cui (2021, Journal of Econometrics) and Cui, Norkute, Sarafidis, and Yamagata (2020, ISER Discussion Paper).
The underlying idea of this approach is to project out the common factors from the exogenous covariates using principal components analysis and then to run IV regression in each of two stages, using the defactored covariates as instruments. The resulting two-stage IV (2SIV) estimator is valid for models with homogeneous or heterogeneous slope coefficients and has several advantages relative to existing popular approaches. In addition, the xtivdfreg command extends the 2SIV approach in two major ways. First, the algorithm accommodates estimation of unbalanced panels. Second, the algorithm permits a flexible specification of instruments. It is shown that when one imposes zero factors, the xtivdfreg command can replicate the results of the popular ivregress Stata command. Notably, unlike ivregress, xtivdfreg permits estimation of the two-way error-components panel-data model with heterogeneous slope coefficients.
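As a point of reference for the zero-factor benchmark mentioned above, a standard ivregress call (dataset and variables from the official documentation):

```stata
* Classic 2SLS: hsngval is endogenous, instrumented by faminc and
* region indicators.
webuse hsng2, clear
ivregress 2sls rent pcturban (hsngval = faminc i.region), vce(robust)
```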

Contributor:
Vasilis Sarafidis
BI Norwegian Business School

Additional information:
UK21_Kripfganz.pdf

Sebastian Kripfganz
University of Exeter Business School
12:15–12:30 Estimating causal effects in the presence of competing events using regression standardization with the Stata command standsurv Abstract: When interested in a time-to-event outcome, competing events that prevent the occurrence of the event of interest may be present. In the presence of competing events, various statistical estimands have been suggested for defining the causal effect of treatment on the event of interest.
Depending on the estimand, the competing events are either accommodated (total effects) or eliminated (direct effects), resulting in causal effects with different interpretation. Separable effects can also be defined for settings where the treatment effect can be partitioned into its effect on the event of interest and its effect on the competing event through different causal pathways. We outline various causal effects of interest in the presence of competing events, including total, direct, and separable effects, and describe how to obtain estimates using regression standardization with the Stata command standsurv. Regression standardization is applied by obtaining the average of individual estimates across all individuals in a study population after fitting a survival model. standsurv supports several models, including flexible parametric models. With standsurv, several contrasts can be calculated: differences, ratios, and other user-defined functions. Confidence intervals are obtained using the delta method. Throughout, we use an example analyzing a publicly available dataset on prostate cancer.
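A minimal sketch of regression standardization with standsurv after a flexible parametric model, assuming the package's at#(), contrast(), and timevar() options; dataset and variables are illustrative, not the prostate-cancer example:

```stata
* Standardized survival under each treatment level and their difference.
ssc install stpm2
ssc install standsurv
webuse brcancer, clear
stset rectime, failure(censrec) scale(365.24)
stpm2 hormon x1, scale(hazard) df(4)
range tt 0 5 50                        // time points for prediction
standsurv, at1(hormon 0) at2(hormon 1) timevar(tt) ci ///
    contrast(difference) atvar(S0 S1) contrastvar(Sdiff)
```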

Contributors:
Sarwar I. Mozumder
Mark J. Rutherford
University of Leicester
Paul C. Lambert
Karolinska Institutet & University of Leicester

Additional information:
UK21_Syriopoulou.pdf

Elisavet Syriopoulou
Karolinska Institutet
1:00–1:20 Covariate adjustment in a randomized trial with time-to-event outcomes Abstract: Covariate adjustment in a randomized trial aims to provide more powerful comparisons of randomized groups. We describe the challenges of planning how to do this in the ODYSSEY trial, which compares two HIV treatment regimes in children.
ODYSSEY presents three challenges: (1) the outcome is time-to-event (time to virological or clinical failure); (2) interest is in the risk at a landmark time (96 weeks after randomization); and (3) the aim is to demonstrate noninferiority (defined as the risk difference at 96 weeks being less than 10 percentage points). The statistical analysis plan is based on the Cox model with predefined adjustment for three covariates. We describe how to use the margins command in Stata to estimate the marginal risks and the risk difference. This analysis does not allow for uncertainty in the baseline survivor function. We compare confidence intervals produced by normal theory and by bootstrapping and (for the risks) using the log-log transform. We compare these methods with Paul Lambert's standsurv, which is based on a parametric survival model. We also discuss an inverse-probability-of-treatment weighting approach, where the weights are derived by regressing randomized treatment on the covariates.
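The weighting step described above can be sketched as follows (variable names are illustrative, not those of ODYSSEY):

```stata
* Weights from a logistic model of randomized treatment on covariates,
* then a weighted Cox model with robust standard errors.
logit trt age cd4 site
predict double ps, pr
gen double ipw = cond(trt == 1, 1/ps, 1/(1 - ps))
stset weeks, failure(failed)
stcox trt [pweight = ipw], vce(robust)
```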

Contributors:
Tim P. Morris
Deborah Ford
MRC Clinical Trials Unit at UCL

Additional information:
UK21_White.pptx

Ian R. White
MRC Clinical Trials Unit at UCL
1:20–1:40 Gravitational effects of culture on internal migration in Brazil Abstract: This presentation conducts empirical research about the role of culture on internal migration in Brazil. To do so, we deploy data from the Latin American Public Opinion Project (LAPOP) and the 2010 Brazilian Census.
Against the background of the gravitational model, we adopt the Poisson pseudo-maximum-likelihood method with fixed effects (PPMLFE) to account for econometric issues. The results obtained provide new evidence on the influence of the migrant’s perceptions about the push-pull factors of Brazilian municipalities. Traditionally, gravitational models apply features such as gross domestic product per capita, unemployment rate, and population density to measure the attractiveness of cities. All in all, these insights on the migrant’s traits and perceptions about culture pave the way to designing appropriate migration policies at the municipal level, since migration supports, among other things, the renewal of the socioeconomic fabric.
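A PPML fixed-effects gravity regression of the kind described can be sketched with ppmlhdfe (Correia, Guimarães, and Zylkin, cited below); variable names are illustrative:

```stata
* Migration flows on distance and destination characteristics, with
* origin and destination fixed effects absorbed.
ssc install ppmlhdfe
ppmlhdfe migrants ln_distance ln_gdppc_dest unemp_dest, ///
    absorb(origin destination) vce(cluster origin)
```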

Contributor:
Philipp Ehrl
Universidade Católica de Brasília

References:

Akerlof, G. A. 1997. Social distance and social decisions. Econometrica 65: 1005–1027.

Alesina, A., R. Baqir, and W. Easterly. 1999. Public goods and ethnic divisions. The Quarterly Journal of Economics 114: 1243–1284.

Alesina, A., R. Baqir, W. Easterly, and E. La Ferrara. 2002. Who trusts others? Journal of Public Economics 85: 207–234.

Anderson, J. E. 2011. The gravity model. Annual Review of Economics 3: 133–160.

Anderson, J. E., M. Larch, and Y. V. Yotov. 2018. geppml: General equilibrium analysis with ppml. The World Economy 41: 2750–2782.

Correia, S., P. Guimarães, and T. Zylkin. 2019a. ppmlhdfe: Fast Poisson estimation with high-dimensional fixed effects. arXiv preprint arXiv:1903.01690.

Correia, S., P. Guimarães, and T. Zylkin. 2019b. ppmlhdfe: Stata module for Poisson pseudo-likelihood regression with multiple levels of fixed effects. Statistical Software Components, Boston College Department of Economics.

Kogut, B. and H. Singh. 1988. The effect of national culture on the choice of entry mode. Journal of International Business Studies 19: 411–432.

Molloy, R., C. L. Smith, and A. Wozniak. 2011. Internal migration in the United States. Journal of Economic Perspectives 25: 173–196.

Silva, J. S. and S. Tenreyro. 2006. The log of gravity. The Review of Economics and Statistics 88: 641–658.

Weber, S. and M. Péclat. 2017. A simple command to calculate travel distance and travel time. The Stata Journal 17: 962–971.


Additional information:
UK21_Lima.pdf

Daisy Assmann Lima
Universidade Católica de Brasília
1:40–2:00 Two-stage sampling in the estimation of growth parameters and percentile norms: Sample weights versus auxiliary variable estimation Abstract: Background: The use of auxiliary variables with maximum-likelihood parameter estimation for surveys that miss data by design is not a widespread approach. Although efficiency gains from the incorporation of normal auxiliary variables in a model have been recorded in the literature, little is known about the effects of nonnormal auxiliary variables in the parameter estimation.
Methods: We simulate growth data to mimic SCALES, a two-stage longitudinal survey of language development. We allow a fully observed Poisson stratification criterion to be correlated with the partially observed model responses and develop five models that host the auxiliary information from this criterion. We compare these models with each other and with a weighted model in terms of bias, efficiency, and coverage. We apply our best performing model to SCALES data and show how to obtain growth parameters and population norms.

Results: Parameter estimation from a model that incorporates a nonnormal auxiliary variable is unbiased and more efficient than its weighted counterpart. The auxiliary variable method can produce efficient population percentile norms and velocities.

Conclusions: When a fully observed variable that dominates the selection of the sample and that is strongly correlated with the incomplete variable of interest exists, its utilization appears beneficial.


Additional information:
UK21_Vamvakas.pdf

George Vamvakas
King's College London
2:00–3:00 Integrating R machine-learning algorithms in Stata using rcall: A tutorial Abstract: rcall is a Stata package that integrates R and R packages in Stata and supports seamless two-way data communication between R and Stata. The package offers two modes of data communication: interactive and noninteractive.
In the first part of the presentation, I will introduce the latest updates of the package (version 3.0) and how to use it in practice for data analysis (interactive mode). The second part of the presentation concerns developing Stata packages with rcall (noninteractive mode) and how to defensively embed R and R packages within Stata programs. All the examples of the presentation, whether for data analysis or for package development, will be based on embedding R machine-learning algorithms in Stata and using them in practice.
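A minimal interactive-mode sketch, using rcall's documented st.data() bridge to pass the current Stata dataset to R:

```stata
* Fit an R regression on Stata's data and print its coefficients.
sysuse auto, clear
rcall: df <- st.data(); print(coef(lm(mpg ~ weight, data = df)))
```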


Additional information:
UK21_Haghish.pdf

Ebad F. Haghish
University of Oslo
3:20–3:40 A bird's-eye view of Bayesian software in 2021: Opportunities for Stata? Abstract: In this presentation, I will review the range of current software that can be used for Bayesian analysis, considering the features, interfaces, and algorithms; the users and their backgrounds; the popular models; and the use cases. I will identify areas where Stata has a strategic or technical advantage and where useful advances can be built into future versions or community-contributed commands without excessive effort.
Stata has developed Bayesian modeling within the framework of its own ado syntax, which has some strengths (for example, the bayes: prefix on familiar and tested commands) and some weaknesses (for example, the limitations to specifying a complex bespoke likelihood or prior). On the other hand, there are Stata components such as the SEM Builder GUI that would potentially be very popular with beginners in Bayes if they were adapted. I will also examine the concept of a probabilistic programming language to specify a model in linked conditional formulas and probability distributions and how it can work with Stata.
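The bayes: prefix mentioned above, in its simplest form, reruns a familiar estimation command as a Bayesian model with default priors:

```stata
* Bayesian linear regression via the bayes: prefix.
sysuse auto, clear
bayes, rseed(1234): regress mpg weight
```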


Additional information:
UK21_Grant.pdf

Robert Grant
BayesCamp Ltd
3:40–4:00 Using xtbreak to study the impacts of European Central Bank announcements on sovereign borrowing Abstract: This presentation investigates how the announcements of the European Central Bank have impacted the cost of sovereign borrowing in central and peripheral European countries.
Using the xtbreak command (Ditzen, Karavias, and Westerlund 2021) in Stata, we test whether the variations of European sovereign spreads can be explained by economic fundamentals in a model that allows for two structural breaks: the first when investors realized that the fiscal sustainability of the EMU should be understood in a decentralized fashion, after the ECB announced it would not bail out Greece; the second when the ECB realized that the existence of the euro was in question and announced it would be able to financially assist countries in financial trouble. We show that a model that allows for structural breaks after the ECB announcements can explain most of the variation in European sovereign spreads.
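A hypothetical sketch of this workflow, assuming xtbreak's test and estimate subcommands and breaks() option as described by Ditzen, Karavias, and Westerlund (2021); variable names are illustrative and the exact syntax should be checked in the help file:

```stata
* Test for, then date, two structural breaks in a panel regression of
* sovereign spreads on fundamentals.
ssc install xtbreak
xtset country quarter
xtbreak test spread debt_gdp growth, breaks(2)
xtbreak estimate spread debt_gdp growth, breaks(2)
```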


Additional information:
UK21_Poiatti.pdf

Natalia Poiatti
USP
4:00–5:00 PyStata: Python and Stata integration Abstract: Stata 16 introduced tight integration with Python, allowing users to embed and execute Python code from all of Stata's programming environments, such as the Command window, do-files, and ado-files. Stata 17 introduced the pystata Python package.
With this package, users can call Stata from various Python environments, including Jupyter Notebook, Jupyter Lab, Spyder IDE, PyCharm IDE, and system command-line environments that can access Python (Windows Command Prompt, macOS terminal, Unix terminal). In this presentation, I will introduce two ways to run Stata from Python: the IPython magic commands and a suite of API functions. I will then demonstrate how to use them to seamlessly pass data and results between Stata and Python.


Additional information:
UK21_Xu.pdf

Zhao Xu
StataCorp
11:00–11:20 Computing score functions numerically using Mata Abstract: Specific econometric models—such as the Cox regression, conditional logistic regression, and panel-data models—have likelihood functions that do not meet the so-called linear-form requirement. That means that the model's overall log-likelihood function does not correspond with the sum of each observation's log-likelihood contribution.
Stata's ml command can fit such models using a particular group of evaluators: the d-family evaluators. Unfortunately, they have some limitations; one is that we cannot directly produce the score functions from the postestimation command predict. Developers who need those functions, for example to compute robust variance–covariance matrices, must therefore write tailored computational routines. In this presentation, I present a way to compute the score functions numerically using Mata's deriv() function with minimal extra programming beyond the log-likelihood function. The procedure is exemplified by replicating the robust variance–covariance matrix produced by the clogit command using simulated data. The results show negligible numerical differences (on the order of 1e-09) between the clogit robust variance–covariance matrix and the numerically approximated one using Mata's deriv() function.
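The core of the approach can be illustrated with a toy Mata example: deriv() returns the numerical gradient (score) of a log-likelihood-like function without any analytic derivatives:

```stata
* Numerical gradient of a toy objective at (0,0); the analytic gradient
* there is (1, -2), so deriv() should return approximately that.
mata:
void myll(real rowvector b, real scalar f)
{
    f = -0.5*(b[1] - 1)^2 - 0.5*(b[2] + 2)^2
}
D = deriv_init()
deriv_init_evaluator(D, &myll())
deriv_init_params(D, (0, 0))
deriv(D, 1)
end
```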


Additional information:
UK21_Gutiérrez-Vargas.pdf

Álvaro A. Gutiérrez-Vargas
KU Leuven
11:20–11:40 Analyzing conjoint experiments in Stata: The conjoint command Abstract: This talk presents conjoint, a new Stata command for analyzing and visualizing conjoint (factorial) experiments in Stata.
Using examples of conjoint experiments from the growing literature—including two from political science involving choices between immigrants (Hainmueller et al. 2014) and between return locations for refugees (Ghosn et al. 2021)—I will briefly explain conjoint experiments and how they are used. Then, and with reference to existing packages and commands in other software, I will explain how conjoint functions to estimate and visualize the two common estimands: average marginal component effects (AMCE) and marginal means (MM). Limitations of conjoint and possible improvements to the command will also be discussed.


Additional information:
UK21_Frith.pdf

Michael J. Frith
University College London
11:40–12:00 Introducing stipw: inverse probability weighted parametric survival models Abstract: Inverse probability weighting (IPW) can be used to estimate marginal treatment effects from survival data. Currently, IPW analyses can be performed in a few steps in Stata (with robust or bootstrap standard errors) or by using stteffects ipw under some assumptions for a small number of marginal treatment effects.
stipw has been developed to perform an IPW analysis on survival data and to provide a closed-form variance estimator of the model parameters using M estimation. This method appropriately accounts for the estimation of the weights and provides a less computationally intensive alternative to bootstrapping. stipw implements the following steps: (1) A binary treatment/exposure variable is modeled against confounders using logistic regression. (2) Stabilized or unstabilized weights are estimated. (3) A weighted streg or stpm2 (Royston–Parmar) survival model is fit with treatment/exposure as the only covariate. (4) Variance is estimated using M estimation. As the stored variance matrix is updated, postestimation can easily be performed with the appropriately estimated variance. Useful marginal measures, such as the difference in marginal restricted survival time, can thus be calculated with uncertainties. stipw will be demonstrated on a commonly used dataset in primary biliary cirrhosis. Robust, bootstrap, and M-estimation standard errors will be presented and compared.
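Steps (1)-(3) can be done by hand for orientation; what stipw adds is step (4), the M-estimation variance. Variable names here are illustrative:

```stata
* Manual IPW survival analysis, without the corrected variance.
logistic trt age sex bili                         // (1) treatment model
predict double ps, pr
gen double w = cond(trt == 1, 1/ps, 1/(1 - ps))   // (2) unstabilized weights
stset years, failure(died)
streg trt [pweight = w], distribution(weibull) vce(robust)  // (3)
```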

Contributors:
Paul C. Lambert
University of Leicester & Karolinska Institutet
Michael J. Crowther
Karolinska Institutet

Additional information:
UK21_Hill.pptx

Micki Hill
University of Leicester
12:00–12:15 The Stata module for CUB models for rating data analysis Abstract: Many survey questions are addressed as ordered rating variables to assess the extent by which a certain perception or opinion holds among respondents.
These responses cannot be treated as objective measures, and a proper statistical framework to account for their fuzziness is the class of CUB (Combination of Uniform and shifted Binomial) models (Piccolo and Simone 2019), which establish a different paradigm for modeling both the individual perception of (feeling toward) the items and the uncertainty. Uncertainty can be considered noise in the measurement of feeling, taking the form of heterogeneity in the distribution. CUB models are specified via a two-component discrete mixture to combine the modeling of feeling and uncertainty. In the baseline version, a shifted binomial distribution accounts for the underlying feeling and a discrete uniform distribution accounts for heterogeneity, but different specifications are possible, for instance to encompass inflated frequencies. The featured parameters can be linked to subjects' characteristics to derive response profiles. Then different items (possibly measured on scales of different lengths) and groups of respondents can be represented and compared through effective visualization tools. Our contribution presents CUB modeling to the Stata community by discussing the CUB module with different case studies that illustrate its applicative scope.

Contributors:
Christopher F. Baum
Boston College
R. Simone
F. Di Iorio
D. Piccolo
University of Naples Federico II

Additional information:
UK21_Cerulli.pdf

Giovanni Cerulli
IRCrES-CNR
12:15–12:30 A robust regression estimator for pairwise-difference transformed data: xtrobreg Abstract: Pairwise-comparison-based estimators are commonly used in statistics. In the context of panel-data fixed-effects estimation, Aquaro and Cizek (2013) have shown that an estimator based on pairwise differences is equivalent to the well-known within estimator.
Relying on this result, they propose to "robustify" the FE estimator by applying a robust regression estimator to pairwise-difference transformed data. In collaboration with Ben Jann, we made available the xtrobreg command that implements this estimator in Stata for both balanced and unbalanced panels. As will be shown in the presentation, the flexibility of the xtrobreg command allows it to be used well beyond the context of panel robust regressions.

Contributor:
Ben Jann
University of Bern

Additional information:
UK21_Verardi.pdf

Vincenzo Verardi
Université Libre de Bruxelles
1:00–1:20 Drivers of COVID-19 outcomes: Evidence from a heterogeneous SAR panel-data model Abstract: In an extension of the standard spatial autoregressive (SAR) model, Aquaro, Bailey, and Pesaran (ABP, Journal of Applied Econometrics 2021) introduced a SAR panel model that allows for the production of heterogeneous point estimates for each spatial unit.
Their methodology has been implemented as the Stata routine hetsar (Belotti 2021). Because the COVID-19 pandemic has evolved in the U.S. since its first outbreak in February 2020, with subsequent resurgences in multiple widespread and severe waves, the level of interaction between geographic units (for example, states and counties) has differed greatly over time in terms of the prevalence of the disease. Applying ABP’s HETSAR model to 2020 and 2021 COVID-19 outcomes (confirmed case and death rates) at the state level, we extend our previous spatial econometric analysis (Baum and Henry 2021) of socioeconomic and demographic factors influencing the spatial spread of COVID-19 confirmed case and death rates in the U.S.

Contributor:
Miguel Henry
Greylock McKinnon Associates

Additional information:
UK21_Baum.pdf

Christopher F. Baum
Boston College, DIW Berlin, and CESIS
1:20–1:40 Panel unit-root tests with structural breaks Abstract: This presentation introduces a new Stata command, xtbunitroot, which implements the panel-data unit-root tests developed by Karavias and Tzavalis (2014).
These tests allow for one or two structural breaks in deterministic components of the series and can be seen as panel-data counterparts of the tests by Zivot and Andrews (1992) and Lumsdaine and Papell (1997). The dates of the breaks can be known or unknown. The tests allow for intercepts and linear trends, nonnormal errors, cross-section heteroskedasticity, and dependence. They have power against homogeneous and heterogeneous alternatives and can be applied to panels with small or large time-series dimensions. We will describe the econometric theory and illustrate the syntax and options of the command with some empirical examples.

Contributor:
Elias Tzavalis
University of Birmingham

Additional information:
UK21_Chen.pdf

Pengyu Chen
Yiannis Karavias
1:40–2:00 rbiprobit: Recursive bivariate probit estimation and decomposition of marginal effects Abstract: This presentation describes Stata's new rbiprobit command for fitting recursive bivariate probit models, which differ from bivariate probit models in allowing the first dependent variable to appear on the right-hand side of the second dependent variable.
Although the estimation of model parameters does not differ from the bivariate case, the existing commands biprobit and cmp do not consider the structural model’s recursive nature for postestimation commands. rbiprobit estimates the model parameters, computes treatment effects of the first dependent variable, and gives the marginal effects of independent variables. In addition, marginal effects can be decomposed into direct and indirect effects if covariates appear in both equations. Moreover, the postestimation commands incorporate the two community-contributed goodness-of-fit tests scoregof and bphltest. Dependent variables of the recursive probit model may be binary, ordinal, or a mixture of both. I present and explain the rbiprobit command and the available postestimation commands using data from the European Social Survey.
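For orientation, the recursive structure itself can already be written with the official biprobit's two-equation syntax (rbiprobit's contribution is the postestimation support); variables here are illustrative:

```stata
* Recursive bivariate probit: y1 enters the equation for y2.
biprobit (y1 = x1 x2) (y2 = y1 x1 x3)
```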


Additional information:
UK21_Coban.pdf

Mustafa Coban
Institute for Employment Research (IAB)
2:00–3:00 Difference in differences in Stata 17 Abstract: Stata 17 introduced two commands to fit difference-in-differences (DID) models and difference-in-difference-in-differences (DDD) models. One of the commands, didregress, is for repeated cross-section models, and the other command, xtdidregress, is for longitudinal or panel data.
In this presentation, I will briefly cover the theory of DID and DDD and then give a practical demonstration of how to fit the models using the new commands. I will also address aspects of standard errors that are appropriate under different scenarios. Graphical diagnostics and tests relevant to the DID and DDD specifications, as well as new areas of development in the DID literature, will also be discussed.
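A minimal didregress call in the Stata 17 syntax, with the outcome in the first set of parentheses and the binary treatment indicator in the second (dataset from the Stata documentation):

```stata
* Two-way DID with standard errors clustered at the group level.
webuse smoking, clear
didregress (packs) (treated), group(state) time(year)
```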


Additional information:
UK21_Pinzón.pdf

Enrique Pinzón
StataCorp
3:20–3:40 Graphics for ordinal outcomes or predictors Abstract: Ordered or ordinal variables, such as opinion grades from strongly disagree to strongly agree, are common in many fields and a leading data type in some. Alternatively, orderings may be sought in the data.
In archaeology and various environmental sciences, there is a problem of seriation, at its simplest finding the best ordering of rows and columns given a data matrix. For example, the goal may be to place archaeological sites in approximate date order according to which artifacts have been found where. Graphics for such data may appear to range from obvious but limited (draw a bar chart if you must) to more powerful but obscure (enthusiasts for complicated mosaic plots or correspondence analyses need to convince the rest of us). Alternatively, graphics are avoided and the focus is only on tabular model output with estimates, standard errors, p-values, and so forth. The need for descriptive or exploratory graphics remains. This presentation surveys various graphics commands by the author, made public through the Stata Journal or SSC, that should not seem too esoteric: principally friendlier and more flexible bar charts, and dedicated distribution or quantile plots. Specific commands include tabplot, floatplot, qplot, and distplot. Mapping grades to scores and considering frequencies, probabilities, or cumulative probabilities on transformed scales are also discussed as simple strategies.
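One of the surveyed commands, sketched; tabplot is installable from SSC, and the showval option (which prints frequencies on the bars) is assumed from its documentation:

```stata
* Table-like array of bars for two categorical variables.
ssc install tabplot
sysuse auto, clear
tabplot rep78 foreign, showval
```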


Additional information:
UK21_Cox.zip

Nicholas J. Cox
University of Durham
3:40–4:40 Advanced data visualizations with Stata Abstract: The presentation will cover innovative use of Stata to create data visualizations that can compete with standard industrial languages like R and Python.
Several existing and new concepts, like heat plots, stacked area graphs, fully customized maps, streamplots, joy plots, polar plots, and spider graphs, plus several new visualization templates currently under development, will be showcased. The presentation will also discuss the importance of customized color schemes to fine-tune the graphs. Propositions for improvements in Stata will be highlighted.


Additional information:
UK21_Naqvi.pdf

Asjad Naqvi
International Institute for Applied Systems Analysis (IIASA)
4:00–4:30 Open panel discussion with Stata developers
StataCorp

Scientific committee

Stephen Jenkins
London School of Economics and Political Science
Roger Newson
King's College London
Michael Crowther
Karolinska Institutet

Logistics organizer

The logistics organizer for the 2021 UK Stata Conference is Timberlake Consultants, the Stata distributor to the United Kingdom and Ireland, France, Spain, Portugal, the Middle East and North Africa, Brazil, and Poland.

View the proceedings of previous Stata Conferences and Users Group meetings.