The 2016 Spanish Stata Users Group meeting was October 20, but you can still interact with the user community even after the meeting and learn more about the presentations shared.


Dealing with endogeneity using Stata
Abstract: Stata has multiple estimators that account for endogeneity. I will briefly discuss these estimators and their assumptions. However, my main focus will be to talk about estimators that account for endogeneity that are not in Stata and can be implemented using gsem and gmm.
Enrique Pinzón
xtdpdml for estimating dynamic panel models
Abstract: Panel data make it possible to both control for unobserved confounders and include lagged, endogenous regressors. Trying to do both at the same time, however, leads to serious estimation difficulties. In the econometric literature, these problems have been solved by using lagged instrumental variables together with the generalized method of moments (GMM). In Stata, commands such as xtabond and xtdpdsys have been used for these models. Here we show that the same problems can be addressed via maximum likelihood estimation implemented with Stata's structural equation modeling (sem) command. We show that the ML (sem) method is substantially more efficient than the GMM method when the normality assumption is met and suffers less from finite sample biases. We introduce a command named xtdpdml with syntax similar to other Stata commands for linear dynamic panel-data estimation. xtdpdml simplifies the SEM model specification process; makes it possible to test and relax many of the constraints that are typically embodied in dynamic panel models; and takes advantage of Stata's ability to use full information maximum likelihood (FIML) for dealing with missing data.

Note also that a preliminary version of the command is already available at https://www3.nd.edu/~rwilliam/dynamic.
Enrique Moral-Benito
Banco de España
Richard Williams
University of Notre Dame
Paul Allison
University of Pennsylvania
Computing functional urban areas using a hierarchical travel time approach: An applied case in Ecuador
Abstract: I present an applied case of how to delineate integrated cities in developing countries, where commuting data are usually unavailable. I follow OECD's definition of Functional Urban Areas and consider Ecuador as a case study. I use satellite imagery to overcome the lack of suitable administrative data. I identify urban cores by means of Landscan data (satellite-derived density), and I use Google Maps and Open Street Maps to compute isochrones, which are used to build polycentric cities. I follow Ahlfeldt and Wendland (2016) and propose using a differentiated threshold for every urban core to define its hinterland, where the dimension of the hinterland is positively related to the dimension of the urban core. I also implement two robustness checks by comparing the results of the procedure with the ones resulting from population flows. I use migration flows derived from the 2010 census as an alternative for commuting flows, assuming they include movements within cities but also between them, which calls for an additional distance restriction in the flows algorithm. I derive commuting flows between municipalities by means of the radiation model (Simini et al. 2012 and Masucci et al. 2013), taking into account the advantage of being a parameter-free model. I use several algorithms programmed in Stata to build a consistent picture of Functional Urban Areas in a developing country such as Ecuador without having to use commuting flows. This procedure is close to standard techniques and arises as a good alternative for defining real cities in developing countries.
Moisés Obaco
Universitat de Barcelona
Vicente Royuela
Universitat de Barcelona
Vítores Xavier
Universitat de Barcelona
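The radiation model mentioned in the abstract can be illustrated with a short Python sketch (not the authors' Stata code). It assumes the standard form from Simini et al. (2012), T_ij = T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij)), where m_i and n_j are the origin and destination populations and s_ij is the population living closer to i than j is, excluding both. The function name, the toy populations, and the use of straight-line distance instead of travel time are assumptions made for illustration only.

```python
import math

def radiation_flows(pop, coords, out_commuters):
    """Expected flows T[i][j] under the radiation model (illustrative sketch)."""
    n = len(pop)
    flows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r_ij = math.dist(coords[i], coords[j])
            # s_ij: population within distance r_ij of i, excluding i and j
            s_ij = sum(
                pop[k]
                for k in range(n)
                if k not in (i, j) and math.dist(coords[i], coords[k]) <= r_ij
            )
            m_i, n_j = pop[i], pop[j]
            flows[i][j] = (
                out_commuters[i] * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))
            )
    return flows

# Hypothetical municipalities: populations, planar coordinates, total out-commuters
pop = [1000, 500, 2000]
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
out_commuters = [100, 50, 200]
flows = radiation_flows(pop, coords, out_commuters)
```

The only inputs are populations, locations, and the number of commuters leaving each origin; no parameter has to be calibrated, which is the property the abstract highlights.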
Real-time prediction of probabilities of inpatient admission for all patients present at the ED at a given moment
Abstract: One of the most common problems in the daily management of specialized care hospitals is the prediction of inpatient admissions originating from the emergency department (ED). In this presentation, I describe the development of a software system for the real-time prediction of probabilities of inpatient admission for all patients present at the ED at a given moment. This software is written in Stata and interacts with the application programming interface (API) of the Weka machine learning software through an ad hoc integration layer. The resulting expert system can be integrated with most software architectures through web services. Our aim was the development of classifiers with adequate performance in terms of both discrimination and calibration (goodness-of-fit), reliant on a small number of variables and available in most ED settings right after triage. The Manchester Triage System (MTS) was used in our setting. Discrimination was evaluated with the area under the ROC curve (AUROC). Calibration was evaluated with Hosmer–Lemeshow (HL) χ² and p-values with 10 fixed probability intervals. We used logistic regression (LR) models and models based on an ad hoc ensemble classifier that optimized calibration. A custom method was used for the evaluation of models, with increasingly larger train sets and 12 consecutive test sets of approximately monthly length. This evaluation method produced the results that follow, reported with 95% confidence intervals (CIs). For LR models, average AUROC = 0.8531, 95% CI (0.8501, 0.8561); for ad hoc ensemble classifier models, average AUROC = 0.8635, 95% CI (0.8605, 0.8665). Average HL χ² were 35.15, 95% CI (32.57, 37.73) for LR models, 10.47; and 11.4, 95% CI (9.10, 13.75) for ad hoc ensemble classifier models. The latter exhibited better calibration than the LR models, with p-values > 0.05 in 10 of the 12 experiments.
Alexander Zlotnik
Universidad Politécnica de Madrid
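The two evaluation measures named in the abstract can be sketched in Python as follows. This is a minimal illustration, not the presenter's Stata/Weka system: the function names are hypothetical, the AUROC is computed by brute-force pairwise comparison, and the Hosmer–Lemeshow statistic assumes equal-width probability intervals (10 by default), skipping intervals that are empty or have degenerate expected counts.

```python
def auroc(y, p):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def hosmer_lemeshow_fixed(y, p, bins=10):
    """HL chi-square using `bins` equal-width probability intervals."""
    obs = [0] * bins      # observed events per interval
    exp = [0.0] * bins    # expected events (sum of predicted probabilities)
    cnt = [0] * bins      # number of cases per interval
    for yi, pi in zip(y, p):
        g = min(int(pi * bins), bins - 1)  # p = 1.0 falls in the last interval
        obs[g] += yi
        exp[g] += pi
        cnt[g] += 1
    stat = 0.0
    for o, e, n in zip(obs, exp, cnt):
        if n == 0 or e == 0 or e == n:
            continue  # skip empty or degenerate intervals
        stat += (o - e) ** 2 / (e * (1 - e / n))
    return stat

# Toy data: observed admissions (1/0) and predicted probabilities
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
auc = auroc(y, p)
hl = hosmer_lemeshow_fixed(y, p)
```

A higher AUROC indicates better discrimination, while a smaller HL statistic (and a larger p-value from the corresponding χ² distribution) indicates better calibration.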
Stata: The future of educational psychology in universities worldwide
Abstract: In this presentation, I will explore how the Department of Educational Psychology at Texas A&M University in College Station, Texas, initiated a positive transition toward Stata with the implementation and use of the program for the first time in the history of the university. The transition was made possible through the design of courses incorporated into the university coursework. I will explore the steps that made this transition possible and the obstacles faced along the way. Additionally, I will speak about how students enrolled in master's and doctoral programs have applied Stata. I will present examples of classroom application and the implications that have generated success in the training of future researchers in the area of educational psychology.
Elizabeth Stackhouse
Texas A&M University
Custom exams: Generation of unique databases with different outcomes to assess students' statistics skills
Abstract: In this presentation, I show how you can create custom exams using Stata so that the answers cannot be interchanged among those who have to complete them. The procedure requires a database from which different samples are extracted to be distributed to the members of a course in a unique .txt, .xls, or .dta file. While the samples are made, Stata calculates the specific solutions and records them so that the teacher can easily correct the answers that students will have to enter in a spreadsheet. As an example, an exercise will be presented for the selection of the best statements of a Likert scale.
Modesto Escobar
Universidad de Salamanca
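The procedure described in the abstract (draw a different sample per student and record each student's solution for the teacher) can be sketched in Python. This is only an illustration of the idea, not the presenter's Stata program: the function name, the student identifiers, the sample size, and the choice of the sample mean as the "solution" are all assumptions.

```python
import random
import statistics

def make_exam(population, student_id, n=5):
    """Draw a reproducible sample for one student and compute its solution."""
    rng = random.Random(student_id)      # seeding by student id makes the draw repeatable
    sample = rng.sample(population, n)   # sampling without replacement
    solution = round(statistics.mean(sample), 2)
    return sample, solution

# Hypothetical source data and class list
population = list(range(1, 101))
answer_key = {}
for sid in (101, 102, 103):
    sample, solution = make_exam(population, sid)
    answer_key[sid] = solution           # the teacher's record of correct answers
```

Seeding by student identifier lets the teacher regenerate any student's data later. Random draws are not guaranteed to differ between students, so a real implementation would check for duplicates.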
Discrete time survival analysis with Stata
Abstract: Researchers often need to analyze time-to-event data where time is measured as a discrete variable. In some situations, the recorded variable contains the actual time values where events happen. However, in most cases, discrete time-to-event data are the result of an underlying continuous process that has been interval-censored. We'll discuss this problem and the implementation of estimation methods in Stata, including extension to discrete-response models, like multilevel models.

We'll also discuss simulation strategies to visualize when a discrete model is better than a continuous model.
Isabel Cañette
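A common first step in discrete-time survival analysis is to expand the data to one row per subject per period at risk, after which the hazard can be estimated with a binary-response model. The Python sketch below shows that expansion and the raw per-period hazard. It is an illustration under assumed inputs (tuples of subject id, last observed period, and event indicator), not the code discussed in the talk.

```python
def person_period(records):
    """Expand (id, last_period, event) into one row per period at risk."""
    rows = []
    for sid, last, event in records:
        for t in range(1, last + 1):
            # the event indicator is 1 only in the final period of a subject who failed
            rows.append((sid, t, int(bool(event) and t == last)))
    return rows

def discrete_hazard(rows):
    """Hazard in period t = events in t / subjects at risk in t."""
    at_risk, events = {}, {}
    for _, t, d in rows:
        at_risk[t] = at_risk.get(t, 0) + 1
        events[t] = events.get(t, 0) + d
    return {t: events[t] / at_risk[t] for t in sorted(at_risk)}

# Toy data: subject 2 is censored after period 3; the others experience the event
records = [(1, 2, 1), (2, 3, 0), (3, 1, 1), (4, 3, 1)]
rows = person_period(records)
hazard = discrete_hazard(rows)
```

Fitting a logit or complementary log-log model to the expanded rows, with period indicators as covariates, reproduces these hazards and allows further covariates to be added.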
Generic versus alternative specific coefficients in conditional logits: An application to party choice
Abstract: The Spatial Theory of Voting contends that the ideological distance to a party is negatively correlated with the probability of voting for that particular party. Following this logic, political scientists have generally analyzed the effect of ideology on vote choice by estimating conditional logits or other similar applications. However, the implementation of the traditional conditional logit estimates the so-called generic attribute coefficient, implying that the attribute is valued identically with regard to all alternatives (parties). This assumption is risky because voters' reactions to issues may vary across parties because each party strategically manipulates the saliencies of selected issues. More concretely, the ideological distance may shape vote choice for some voters but not for others. In this presentation, I show the need to challenge this assumption by "splitting" the generic parameter in conditional logits into alternative-specific parameters. By analyzing vote choice in EU elections in three different countries (Germany, Italy, and Spain), this presentation highlights the importance of this statistical identification. I will start by explaining the theoretical basis of the model and its implementation, and I will offer some examples to show how it can be implemented in Stata.
Toni Rodon
Stanford University
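The difference between a generic and an alternative-specific coefficient can be seen directly in the conditional logit choice probabilities, P(j) = exp(beta_j * d_j) / sum_k exp(beta_k * d_k), where d_j is the voter's ideological distance to party j. In the Python sketch below, the function name, the distances, and the coefficient values are made up for illustration and are not estimates from the talk.

```python
import math

def choice_probs(distances, betas):
    """Conditional logit probabilities with one coefficient per alternative."""
    utils = [b * d for b, d in zip(betas, distances)]
    m = max(utils)                               # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

# One voter's ideological distance to three hypothetical parties
distances = [1.0, 2.0, 4.0]

# Generic coefficient: distance matters equally for every party
generic = choice_probs(distances, [-0.5, -0.5, -0.5])

# Alternative-specific coefficients: distance matters little for party 2
specific = choice_probs(distances, [-0.5, -0.1, -1.0])
```

Under the generic coefficient the closest party is always the most likely choice. With alternative-specific coefficients the ranking can change, which is the behavior the "splitting" of the parameter is meant to capture.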
The electoral origins of fiscal capacity
Abstract: This presentation theorizes the rise of the modern fiscal state as a by-product of time-inconsistent electoral calculations by incumbent elites with distinctive ideological constituencies. I claim that nineteenth-century parties made myopic political and fiscal decisions that resulted from their weak internal cohesion and organization. Being clubs more than modern party organizations, they did not internalize the long-run policy costs of decisions that maximized their immediate electoral fortunes. The analysis generates novel predictions about the partisan determinants of both the extensions of franchise and the development of fiscal policy, which are tested on a new dataset of parliamentary plurality by party families in 10 European democracies between 1820 and 1975.
Pablo Beramendi
Universitat Pompeu Fabra
Didac Queralt
Universitat Pompeu Fabra
Is my nation cool enough? National identification in difficult economic times
Abstract: Does nationalism increase with economic crisis? This presentation seeks to answer this question and examines whether changes in the nation's economy and in individuals' economic situation affect people's national attitudes. Borrowing from the social identity theory, I argue that people care about their individual status and the status of the group with which they identify. In this way, when individuals' economic status deteriorates, their national identity strengthens. Yet, when the status of the nation depreciates, their reaction weakens their identification with the national group. To test this theory, I combine data from two monographic surveys of the International Social Survey Program (National identity 2003 and 2013) and a six-wave online panel study carried out in Spain between 2009 and 2014 to assess the impact that changes in the nation's status and intraindividual changes in the economic status have on national pride, closeness to the nation, and españolismo (Spanish nationalism).

Results from the cross-country analyses show that closeness to the nation and national pride decrease when the economic status of the nation deteriorates. Results from the panel analyses show that individual changes in the economic situation are related to intraindividual changes in Spanish nationalism. Losses of income translate to a stronger Spanish nationalism but only among people with a lower prior level of it. The panel analyses also show that the individual perception of the economic status of the nation matters. When the individual assessment of the economy improves over time, the nation is perceived as a more desirable category of identification, leading to a reinforcement of Spanish nationalism.
María José Hierro
Universitat Autònoma de Barcelona
Electoral forecasting with Stata
Abstract: One of the aims of the social sciences is the prediction of phenomena. Among the more attractive predictions not only to the scientific community but also to the media is the electoral outcome. Basically, there are two methods of predicting. One is through time series in which external factors are introduced; the other is through the use of polls. I discuss how to use Stata to forecast electoral data through questionnaires, given the enormous amount of interference that occurs in the collection of information: from sample designs, problems of nonresponse, to biased statements of future voters. The survey Stata module is used to poststratify using the recall of past vote. I will give special attention to the svy Stata command and its use in two steps to predict the outcome of the general election. I will also present a program that will make predictions without microdata from direct estimates obtained in the polls. All of this is done for the Spanish case, using surveys published in the media and regularly carried out by the Governmental Center for Sociological Research.
Modesto Escobar
Universidad de Salamanca
Pablo Cabrera
NatCen Social Research (London), Universidad de Salamanca
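Poststratifying on recalled past vote amounts to weighting each respondent by the ratio of a party's official share in the previous election to its share among respondents' recalled votes. The Python sketch below illustrates that arithmetic; it is not the authors' Stata program, and the function names, party labels, and figures are hypothetical.

```python
def poststrat_weights(recall, official_shares):
    """Weight = official past-vote share / sample share of the recalled vote."""
    n = len(recall)
    counts = {}
    for r in recall:
        counts[r] = counts.get(r, 0) + 1
    return [official_shares[r] / (counts[r] / n) for r in recall]

def weighted_shares(intention, weights):
    """Weighted distribution of stated vote intention."""
    total = sum(weights)
    shares = {}
    for v, w in zip(intention, weights):
        shares[v] = shares.get(v, 0.0) + w / total
    return shares

# Toy poll: party A is overrepresented among recalled past votes
recall = ["A", "A", "A", "B"]
official = {"A": 0.5, "B": 0.5}          # hypothetical previous election result
intention = ["A", "A", "B", "B"]

weights = poststrat_weights(recall, official)
forecast = weighted_shares(intention, weights)
```

In this toy poll the unweighted intention is split evenly, but after correcting for the overrepresentation of past A voters the estimate shifts toward B.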

Universitat Pompeu Fabra
Plaça de la Mercè, 10-12
08002 Barcelona

Participants must travel at their own expense. The registration fee covers the meeting, lunch, and coffee breaks. There will also be an optional dinner at a restaurant to be announced at an additional cost of approximately 35 euros.


Scientific committee


Raul Ramos
Universitat de Barcelona

Vicente Royuela
Universitat de Barcelona


Josep Maria Domenech
Universitat Autònoma de Barcelona

Sociology and Political Science

Modesto Escobar
Universidad de Salamanca

Bruno Arpino
Universitat Pompeu Fabra

Mariano Torcal
Universitat Pompeu Fabra

