2018 Spanish Stata Conference | Stata

The Spanish Stata Conference was held on 24 October 2018 at Universitat Pompeu Fabra, Campus Ciutadella, Edifici Mercé Rodoreda, but you can view the program below.

Proceedings

9:30–10:45	Introduction to Bayesian analysis using Stata Abstract: Researchers' interest in Bayesian regression analysis has increased significantly in recent years. One of the fundamental reasons for this growing interest is that a wide variety of models can be accommodated within this alternative regression approach. This flexibility is due in part to the possibility of using a common theoretical framework to estimate the parameters for posterior distributions associated with different kinds of model specifications. I will outline the main aspects associated with Bayesian regression in Stata, and I will show the facilities incorporated in Stata 15 to make this kind of analysis more accessible to those who are not very familiar with this approach. Additional information: spain18_Sánchez.pdf Gustavo Sánchez StataCorp
10:45–11:15	Ensemble learning targeted maximum-likelihood estimation for Stata users Abstract: eltmle is a Stata program implementing targeted maximum-likelihood estimation (TMLE) of the average treatment effect (ATE) for a binary or continuous outcome and binary treatment. eltmle includes the use of a super learner called from the SuperLearner package v.2.0-21 (Polley et al. 2011). Modern epidemiology has been able to identify significant limitations of classic epidemiological methods, like outcome regression analysis, when estimating causal quantities such as the ATE for observational data. For example, using classical regression models to estimate the ATE requires the assumption that the effect measure is constant across levels of confounders included in the model, i.e., that there is no effect modification. Other methods do not require this assumption, including g-methods (for example, the g-formula) and TMLE. The ATE, or risk difference, is the most commonly used causal parameter. Many estimators of the ATE, but not all, rely on parametric modeling assumptions. Therefore, the correct model specification is crucial to obtain unbiased estimates of the true ATE. TMLE is a semiparametric, efficient substitution estimator allowing for data-adaptive estimation while obtaining valid statistical inference based on targeted minimum loss-based estimation. TMLE has the advantage of being doubly robust. Moreover, TMLE allows inclusion of machine learning algorithms to minimize the risk of model misspecification, a problem that persists for competing estimators. Evidence shows that TMLE typically provides the least biased estimates of the ATE compared with other doubly robust estimators. The following links provide access to a TMLE tutorial: https://migariane.github.io/TMLE.nb.html and the GitHub repository for the eltmle Stata package, https://github.com/migariane/meltmle.
Additional information: spain18_Luque-Fernández(1).pdf Miguel Ángel Luque-Fernández Universidad de Granada, London School of Hygiene and Tropical Medicine, CIBERESP ISCIII
11:45–12:15	Text mining with ngram variables Abstract: Text data, such as answers to open-ended questions, are sometimes ignored because they are hard to analyze. My community-contributed Stata command ngram turns text into hundreds of variables using the "bag of words" approach. Broadly speaking, each variable records how often the corresponding word or word sequence occurs in a given text. This is more useful than it sounds. The program supports text in 12 European languages. Additional information: spain18_Schonlau.pdf Matthias Schonlau University of Waterloo
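The "bag of words" idea in the abstract above can be illustrated with a minimal Python sketch. This is not the ngram command's implementation: the function names (tokenize, ngram_counts, bag_of_ngrams) and the two example answers are hypothetical, and the tokenization is deliberately crude (lowercasing and punctuation stripping only, with no stemming or stopword removal).

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace after blanking punctuation."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return cleaned.split()


def ngram_counts(text, max_n=2):
    """Count every word sequence of length 1..max_n in one text."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def bag_of_ngrams(texts, max_n=2):
    """Return (vocabulary, rows): one column per n-gram, one row per text."""
    per_text = [ngram_counts(t, max_n) for t in texts]
    vocabulary = sorted(set().union(*per_text))
    rows = [[counts[term] for term in vocabulary] for counts in per_text]
    return vocabulary, rows


# Hypothetical open-ended survey answers, invented for illustration.
answers = ["The service was good", "Good service, good price"]
vocabulary, rows = bag_of_ngrams(answers)
for term, column in zip(vocabulary, zip(*rows)):
    print(f"{term!r}: {list(column)}")
```

Each printed line corresponds to one generated variable: the n-gram and how often it occurs in each answer.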
12:15–12:45	Cross-validated area under the ROC curve (cvAUROC) Abstract: Receiver operating characteristic (ROC) analysis is used for comparing predictive models, both in model selection and model evaluation. This method is often applied in clinical medicine and social science to assess the tradeoff between model sensitivity and specificity. After one fits a binary logistic regression model with a set of independent variables, the predictive performance of this set of variables, as assessed by the area under the curve (AUC) from an ROC curve, must be estimated for a sample (the "test" sample) that is independent of the sample used to fit the model (the "training" sample). An important aspect of predictive modeling (regardless of model type) is the ability of a model to generalize to new cases. Evaluating the predictive performance (AUC) of a set of independent variables using all cases from the original analysis sample tends to result in an overly optimistic estimate of predictive performance. K-fold cross-validation can be used to generate a more realistic estimate of predictive performance. To assess a model's ability to generalize when the number of observations is not very large, cross-validation and bootstrap strategies are useful. cvAUROC is a community-contributed Stata command that implements k-fold cross-validation for the AUC for a binary outcome after fitting a logistic regression model and provides the cross-validated fitted probabilities for the dependent variable or outcome, contained in a new variable named _fit. Different options and examples for the use of cvAUROC can be downloaded at https://github.com/migariane/cvAUROC and can be directly installed in Stata using ssc install cvAUROC. Additional information: spain18_Miguel Ángel Luque-Fernández(2).pdf Miguel Ángel Luque-Fernández Universidad de Granada, London School of Hygiene and Tropical Medicine, CIBERESP ISCIII Camille Maringe London School of Hygiene and Tropical Medicine
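To make the k-fold idea in the abstract above concrete, here is a minimal Python sketch. It is not the cvAUROC code: the simulated data, the one-predictor gradient-descent logistic fit, and all function names are assumptions made for illustration. The sketch pools the out-of-fold fitted probabilities before computing a single AUC, which is one common convention and not necessarily what cvAUROC reports.

```python
import math
import random


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(xs, ys, lr=0.5, steps=500):
    """One-predictor logistic regression fitted by plain gradient descent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = 1.0 / (1.0 + math.exp(-(b0 + b1 * x))) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def cross_validated_fit(xs, ys, k=5, seed=1):
    """Fitted probability for each case from a model that never saw that case."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    fit = [None] * len(xs)
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        b0, b1 = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            fit[i] = 1.0 / (1.0 + math.exp(-(b0 + b1 * xs[i])))
    return fit


# Simulated data, invented for illustration: one predictor and a noisy binary outcome.
rng = random.Random(0)
xs = [rng.uniform(-2, 2) for _ in range(200)]
ys = [1 if x + rng.gauss(0, 0.5) > 0 else 0 for x in xs]

fit = cross_validated_fit(xs, ys, k=5)
print("cross-validated AUC:", round(auc(ys, fit), 3))
```

The list `fit` plays the role that the abstract assigns to the _fit variable: one out-of-sample fitted probability per observation.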
12:45–1:15	The impact of the priority review voucher on research and development investment for neglected diseases Abstract: The priority review voucher (PRV) was implemented in the United States in 2007 with the aim to stimulate research and development (R&D) for neglected diseases. The idea is the following: pharmaceutical companies are granted a priority review voucher by the Food and Drug Administration (FDA) (for example, review within 6 months compared with the standard 10 months) upon successful development of a product (for example, drug or vaccine) for diseases on the PRV list. The voucher either can be used for a blockbuster drug or sold to a third party. The PRV is believed to be a strong consideration among pharmaceutical companies to initiate or continue a project for a neglected disease; the most recent voucher was granted in June 2018. R&D investment is measured by the number of clinical trials initiated yearly and per disease, which is downloadable from the WHO platform registry. Because the policy targets a specific group of diseases in a specific country (the U.S.), we isolate the impact of the policy through the differences-in-differences (DD) approach and differences-in-differences-in-differences (DDD) approach. Céline Aerts Barcelona Institute for Global Health (ISGlobal) Marisa Miraldo Eliana Barenho Imperial College London Elisa Sicuri Barcelona Institute for Global Health (ISGlobal), Imperial College London
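The arithmetic behind the DD and DDD comparisons named in the abstract above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the trial counts are invented, and the design (listed versus non-listed diseases, U.S. versus elsewhere, before versus after the policy) is a guess at the grouping, not the authors' specification. A real analysis would estimate these contrasts in a regression with standard errors.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the comparison group."""
    return (mean(treated_post) - mean(treated_pre)) - (mean(control_post) - mean(control_pre))


# Hypothetical yearly trial counts, invented for illustration only.
us_listed = {"pre": [4, 6], "post": [9, 11]}      # PRV-listed diseases, U.S.
us_other = {"pre": [3, 5], "post": [5, 7]}        # non-listed diseases, U.S.
else_listed = {"pre": [2, 4], "post": [4, 6]}     # PRV-listed diseases, elsewhere
else_other = {"pre": [2, 2], "post": [3, 3]}      # non-listed diseases, elsewhere

# DD within the U.S.: listed versus non-listed diseases, before versus after.
dd_us = diff_in_diff(us_listed["pre"], us_listed["post"], us_other["pre"], us_other["post"])

# The same contrast outside the U.S., where the policy does not apply.
dd_elsewhere = diff_in_diff(
    else_listed["pre"], else_listed["post"], else_other["pre"], else_other["post"]
)

# DDD: how much larger the listed-disease gain is in the U.S. than elsewhere.
ddd = dd_us - dd_elsewhere
print("DD (U.S.):", dd_us, "DD (elsewhere):", dd_elsewhere, "DDD:", ddd)
```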
1:15–1:45	Demand for house improvement in rural Gambia Abstract: We estimated the demand for house improvement in rural Gambia, West Africa, by exploring three definitions of demand: utility-derived demand, stated demand, and revealed preferences-based demand. Data were collected in the context of a cluster-randomized controlled trial aiming at identifying and measuring the impact of improved houses on selected health outcomes. We collected panel data (4 rounds over approximately 1 year to control for seasonality) from nearly 200 households representing intervention, control and nonstudy groups, from a random subsample of 15 study villages. We collected information on satisfaction with owned houses (utility), willingness to pay for house improvement (stated preferences), and routine housing behavior (revealed preferences). We estimated the determinants of demand through ordered logit or linear (depending on the outcome variable distribution) fixed-effects models. Under the hypothesis that housing investment choices in such a rural context (and considering the short term) aim at maintaining utility constant across seasons, we plotted predicted demand from the estimated models against time (rounds) and analyzed and interpreted differences across the three definitions of demand. Additional information: spain18_Sicuri.pdf Elisa Sicuri Barcelona Institute for Global Health (ISGlobal), Imperial College London Lesong Conteh Barcelona Institute for Global Health (ISGlobal)
2:45–4:00	Estimating and interpreting effects for nonlinear and nonparametric models Abstract: After we fit a model, our analysis does not stop. We want to use our results to construct counterfactual scenarios. We want to study the effects of changes in variables over the population or for a specific subpopulation. Answering such questions is more challenging for nonlinear models and, in particular, for models in which we make no assumptions about functional forms—nonparametric models. In this presentation, we will illustrate how to answer these and other relevant empirical questions for nonlinear cross-sectional and panel-data models and for nonparametric models. We do this within a unified framework using Stata. Additional information: spain18_Pinzón.pdf Enrique Pinzón StataCorp
4:00–4:30	Propensity-score matching with clustered data in Stata Abstract: In observational studies, estimation of causal effects often relies on the assumption that all relevant confounders are observed. Under this assumption, propensity-score matching (PSM) can be used to adjust for observed confounders. PSM is a semiparametric alternative to regression models that consists of two steps: 1) estimation of the probability of receiving the treatment (propensity score); 2) matching on the estimated propensity score. PSM was originally proposed for unstructured data, and available Stata routines are designed for these types of data. However, clustered or hierarchical data are common in many fields of study (for example, students nested in schools, voters in parties, patients in hospitals). Building on recent methodological developments, the goal of this presentation is to show how PSM can be implemented with clustered data in Stata. Using examples on real data, I will present methods that exploit the information on the clustered structure of the data in two ways: in the estimation of the propensity-score model (through the inclusion of fixed or random effects) or in the implementation of the matching algorithm. Additional information: spain18_Arpino.pdf Bruno Arpino Universitat Pompeu Fabra
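The second of the two ways of using cluster information mentioned in the abstract above (in the matching algorithm itself) can be sketched in Python. This is an illustration under stated assumptions, not the presenter's method: propensity scores are taken as already estimated (step 1), matching is nearest-neighbor with replacement restricted to the same cluster, the caliper of 0.1 is arbitrary, and the data and function names are hypothetical.

```python
from statistics import mean


def match_within_clusters(units, caliper=0.1):
    """Pair each treated unit with the closest-scoring control in its own cluster.

    Matching is with replacement; treated units with no same-cluster control
    inside the caliper are left unmatched.
    """
    pairs = []
    for t in (u for u in units if u["treated"]):
        candidates = [c for c in units if not c["treated"] and c["cluster"] == t["cluster"]]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(c["pscore"] - t["pscore"]))
        if abs(best["pscore"] - t["pscore"]) <= caliper:
            pairs.append((t["id"], best["id"]))
    return pairs


# Hypothetical units with pre-estimated propensity scores, invented for illustration.
units = [
    {"id": 1, "cluster": "A", "treated": 1, "pscore": 0.60, "y": 10},
    {"id": 2, "cluster": "A", "treated": 0, "pscore": 0.58, "y": 7},
    {"id": 3, "cluster": "A", "treated": 0, "pscore": 0.30, "y": 5},
    {"id": 4, "cluster": "B", "treated": 1, "pscore": 0.40, "y": 8},
    {"id": 5, "cluster": "B", "treated": 0, "pscore": 0.45, "y": 6},
    {"id": 6, "cluster": "B", "treated": 1, "pscore": 0.90, "y": 12},
]

pairs = match_within_clusters(units)
outcome = {u["id"]: u["y"] for u in units}

# Average outcome difference over matched pairs (effect on the matched treated units).
att = mean(outcome[t] - outcome[c] for t, c in pairs)
print("matched pairs:", pairs, "estimated effect:", att)
```

Unit 6 stays unmatched because its only same-cluster control lies outside the caliper, which shows the cost of within-cluster matching: some treated units may have to be dropped or matched across clusters instead.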
4:45–5:15	Exercises on the Internet for researchers and students to learn Stata Abstract: Since the release of Stata 15, it has been possible to convert the results of analyses into .doc (putdocx), .pdf (putpdf), and .html (dyndoc) files. This presentation demonstrates the process by which this is achieved to create a set of basic exercises online (http://bit.ly/Analisis2018), so researchers and students can learn how to manage Stata. First, I discuss the varied file types and how to work with them. Then, I present the steps necessary for obtaining basic analysis with the program, including percentage tables, means, and regressions. In addition to this option, Stata's dyndoc command can generate other web pages unrelated to the program, with minimal knowledge of the HTML language. Additional information: spain18_Escobar.pdf Modesto Escobar Universidad de Salamanca
5:15–5:45	Graphical and numerical solutions to standard research problems in the social sciences: Some suggestions and unresolved challenges Abstract: The goal of this presentation is to identify some common analytical problems that are often encountered in quantitative research in a wide array of social science applications (and possibly in other research fields as well), such as the analysis of multicollinearity of independent variables when qualitative variables are involved; the elaboration of three-way contingency tables with percentages; the presentation of predictive margins and frequency distributions of both qualitative and quantitative variables; the presentation of information both on predictive margins and on contrasts of the statistical significance of the differences of the effects of adjacent and nonadjacent categories of qualitative independent variables; and the construction of time-series graphs based on the frequency distribution of categorical variables. I will put forward some solutions with Stata for discussion among the audience and identify some unresolved challenges. Additional information: spain18_Rama.ppsx José Rama Andrés Santana Universidad Autónoma de Madrid
5:45–6:15	Does interview length affect panel attrition? Abstract: Panel attrition is a threat to data quality in longitudinal studies, especially if those who drop out of the study differ from the respondents who remain in the panel. This presentation investigates the effect of survey length on wave nonresponse using data from Understanding Society, the United Kingdom Household Longitudinal Study (UKHLS). The concept of survey length is addressed from a theoretical point of view, and two measures, length and interview pace, are computed to test their effect on survey cooperation. Pablo Cabrera-Álvarez David Dóncel Abad Universidad de Salamanca
6:15–6:45	A new proposal for the comparative analysis toward linguistic educational policies in multinational settings: An application with the Stata software for the Catalan and Basque cases Abstract: The goal of this presentation is to put forward a new set of indexes and data analytic strategies for the comparative study of attitudes toward linguistic educational policies in multinational settings. These indexes deal with the attitudes toward the linguistic mix in primary and secondary education, most notably regarding the local-international dimension (regional and state-wide ones vis-à-vis English) and the subnational-national one. Empirical analysis will be performed with Stata using data of a specialized survey for the Catalan case (N > 2,200) and the Eusko-barometer of May 2018 (N > 600). Several analytical options will be presented for discussion. Additional information: spain18_Santana.ppsx Andrés Santana Universidad Autónoma de Madrid
6:45–7:15	Wishes and grumbles Abstract: Stata developers present will carefully and cautiously consider wishes and grumbles from Stata users in the audience. Questions, and possibly answers, may concern reports of present bugs and limitations or requests for new features in future releases of the software. StataCorp personnel StataCorp

Scientific committee

Economics:
Dr. Andre Groger
Universitat Autónoma de Barcelona and Barcelona GSE

Sociology and Political Science:
Dr. Modesto Escobar
Department of Sociology and Communication, Universidad de Salamanca

Dr. Mariano Torcal
Political and Social Sciences, RECSM and Universitat Pompeu Fabra

Medicine:
Dr. Sergi Sanz
Biostatistics and Data Management Unit, ISGlobal and Universitat de Barcelona

Dr. Llorenç Quintó
ISGlobal

Logistics organizer

The logistics organizer for the 2018 Spanish Stata Conference is Timberlake Consulting S.L.,
the distributor of Stata in Spain.

View the proceedings of previous Stata Users Group meetings.
