2015 Spanish Stata Users Group meeting 
  
 22 October 2015 
 
  Instituto de Empresa
  Calle de María de Molina, 13
  28006 Madrid
  Spain
Proceedings
 		Revisiting generalized method of moments
 	Enrique Pinzon
 	StataCorp
The generalized method of moments (GMM) estimator, an economist's favorite, was 
introduced in Stata 11. GMM is useful in many other disciplines, however, and 
we have used it extensively in the treatment-effects commands released in 
Stata 13 and Stata 14. I will briefly discuss some relevant properties of GMM 
and then show how it is used in treatment-effects estimation. I will conclude 
with a simple application of GMM that is new in the literature.
  
   Additional information
   spain15_pinzon.pdf
 
          A low CD4/CD8 ratio during effective ART predicts immunosenescence and morbidity/mortality 
 Sergio Serrano-Villar 
  University Hospital Ramón y Cajal 
 Santiago Moreno 
  University Hospital Ramón y Cajal 
 Talia Sainz 
  University Hospital La Paz 
 April L. Ferre 
  University of California, Davis 
 Sulggi A. Lee 
  University of California, San Francisco 
 Peter W. Hunt 
  University of California, San Francisco 
 Elizabeth Sinclair 
  University of California, San Francisco 
 Vivek Jain 
  University of California, San Francisco 
 Frederick M. Hecht 
  University of California, San Francisco 
 Steven G. Deeks 
  University of California, San Francisco 
A low CD4/CD8 ratio in elderly HIV-uninfected adults is associated with 
increased mortality. A subset of HIV-infected adults receiving effective 
antiretroviral therapy (ART) fails to normalize this ratio, even after 
they achieve normal CD4+ T-cell counts. The immunologic and clinical 
characteristics of this clinical phenotype remain undefined. Using data from four 
distinct clinical cohorts, we show that a low CD4/CD8 ratio in HIV-infected 
adults during otherwise effective ART (CD4+ T-cell counts >500 cells/mm3) 
is associated with a number of immunological abnormalities. Longitudinal 
changes in CD4+ and CD8+ T-cell counts and in the CD4/CD8 ratio were assessed 
using linear mixed models with random intercepts. Age, gender, and pre-ART CD4+ 
T-cell count were included in multivariate analyses as fixed effects. Interaction 
terms were created to assess whether these changes over time differed significantly 
between the early and later ART initiators. Changes in slopes before and after ART 
time points were assessed using linear splines. Individuals who initiated ART within 
6 months of infection had a greater increase in the CD4/CD8 ratio than later 
initiators (>2 years). Conditional logistic regression analysis showed that a low 
CD4/CD8 ratio predicted a higher risk of morbidity and mortality. Hence, this 
clinically accessible measurement may prove useful in monitoring response to ART 
and could identify a unique subset of individuals in need of novel therapeutic interventions.  
  
   Additional information
   spain15_serrano.pdf
 
            Assessing convergent and discriminant validity in the ADHD-R IV rating scale: User-written commands for average variance extracted (AVE), composite reliability (CR), and heterotrait-monotrait ratio of correlations (HTMT)
 David Alarcón Rubio 
  Universidad Pablo de Olavide 
 José Antonio Sánchez Medina  
  Universidad Pablo de Olavide 
Convergent and discriminant validity examine the extent to which a latent 
variable is different from others in a variance-based SEM. The criterion of 
Fornell-Larcker (1981) has been commonly used to assess the degree of shared 
variance between the latent variables of the model. According to this criterion, 
convergent validity can be assessed by composite reliability (CR) and average 
variance extracted (AVE). CR is a less biased estimate of reliability than Cronbach's 
alpha; the acceptable value of CR is 0.7 and above. AVE measures the level of variance 
captured by a construct versus the level due to measurement error; values above 0.7 
are considered very good, whereas a level of 0.5 is acceptable. Discriminant validity 
is assessed by comparing AVE with the squared correlation between two constructs. The 
square root of AVE should be greater than the correlations involving the 
constructs. Recently, the heterotrait-monotrait ratio of the correlations (HTMT) 
approach has been proposed to assess discriminant validity. HTMT is the average of 
the heterotrait-heteromethod correlations relative to the average of the 
monotrait-heteromethod correlations. The present work presents a series of user-written 
commands to obtain these indicators of convergent and discriminant validity for 
confirmatory factor-analysis models and to calculate their confidence 
intervals using the bootstrap method. To demonstrate the use of these commands, we use 
data from a sample of high school students who have been administered the ADHD-R IV rating scale.
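
The three indicators described above can be illustrated with a minimal Python sketch. It is not the authors' user-written Stata commands. It assumes standardized factor loadings (so each item's error variance is 1 minus the squared loading) and computes HTMT with the geometric mean of the two within-construct averages, as in the usual formulation of the ratio. All function names and the example numbers are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def ave(loadings):
    """Average variance extracted: mean of the squared standardized loadings."""
    return mean(l * l for l in loadings)


def composite_reliability(loadings):
    """Composite reliability, assuming error variance = 1 - loading**2."""
    total = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)


def htmt(corr, items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs.

    corr is the item correlation matrix (list of lists); items_a and
    items_b hold the row/column indices of each construct's items.
    Each construct needs at least two items.
    """
    hetero = mean(corr[i][j] for i in items_a for j in items_b)
    mono_a = mean(corr[i][j] for i, j in combinations(items_a, 2))
    mono_b = mean(corr[i][j] for i, j in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical loadings for one construct and a made-up 4-item correlation matrix.
loadings = [0.7, 0.8, 0.9]
corr = [
    [1.0, 0.8, 0.4, 0.4],
    [0.8, 1.0, 0.4, 0.4],
    [0.4, 0.4, 1.0, 0.5],
    [0.4, 0.4, 0.5, 1.0],
]
print("AVE:", ave(loadings))
print("CR:", composite_reliability(loadings))
print("HTMT:", htmt(corr, [0, 1], [2, 3]))
```

Under the Fornell-Larcker criterion, the square root of each construct's AVE would then be compared with that construct's correlations with the other constructs.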
  
   Additional information
   spain15_alarcon.pdf
 
    Differences in perinatal health among immigrant and native-origin children: Evidence from differentials in weight at birth in Spain 
 Hector Cebolla-Boado 
  Universidad Nacional de Educación a Distancia 
 Leire Salazar 
  Universidad Nacional de Educación a Distancia 
This presentation explores differences in perinatal inequality between migrants and natives 
in Spain and, more specifically, differences in the weight at birth.
In line with the logic of the "healthy immigrant paradox", the children of immigrant 
mothers are known for having a lower risk of low weight at birth (LBW; <2,500 grams).
Using the universe of births in Spain in 2013 (excluding preterm and multiple births), 
we go beyond the standard approach of using a dichotomous variable for estimating the 
risk of LBW.
Using Stata, we estimate quantile regression to explore migrant-native differentials 
in weight at birth across the range of observed values and also concentrate on the 
impact of migrant status among babies weighing above 4,000 grams, a threshold that, 
similarly to LBW, is associated with certain pathological characteristics and a 
problematic future development.
Our research not only confirms that the well-known epidemiological regularity of 
healthier babies among migrants in advanced democracies also applies to Spain, namely, 
an advantage of immigrant-origin babies in terms of avoiding LBW, but also confirms 
that in the other extreme, when the baby's weight is above 4,000 grams, 
migrant-origin babies weigh over 110 grams more than native-origin ones. In sum, 
we contribute to the literature by showing that the higher average weight of newly 
born babies from immigrant mothers is not always a source of perinatal advantage.
 
             aries: An implementation of CART in Stata 
 Ricardo Mora 
  Universidad Carlos III de Madrid 
Tree-structured models use two-dimensional binary trees as a predictive model. 
Tree models where the target variable can take a finite set of values are called 
classification trees. Decision trees where the target variable can take continuous 
values (typically real numbers) are called regression trees. Estimation of the tree 
is trivial in both classification and in regression trees if the structure of the tree 
is known. Otherwise, several algorithms have been proposed, and several software packages 
implement these algorithms, notably the classification and regression trees (CART) 
algorithm by Breiman et al. (1984) (for example, Salford Systems CART, Matlab, and R). In Stata, 
the module cart, developed by Wim van Putten, performs a CART analysis but only for 
failure time data. In this presentation, I discuss a new module, aries, that performs the basic 
CART algorithm for both binary and continuous dependent variables.
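
The core step of a regression tree, choosing the split that best separates the outcome, can be sketched in a few lines of Python. This is not the aries module. It is a minimal illustration that assumes a single continuous predictor and uses the sum of squared errors as the splitting criterion; the function names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, cost) of the binary split on x that minimizes
    the total within-node SSE of y, or None if x has no distinct values."""
    pairs = sorted(zip(x, y))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical data with an obvious break between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
print(best_split(x, y))
```

A full tree would apply this search recursively to each resulting node and over all predictors; a classification tree would replace the SSE with an impurity measure such as the Gini index.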
  
   Additional information
   spain15_mora.pdf
 
           Stata web services: Toward Stata-based healthcare informatics applications integrated in a service-oriented architecture (SOA)
 Alexander Zlotnik 
  Technical University of Madrid 
  University Hospital Ramón y Cajal 
 Modesto Escobar 
  Universidad de Salamanca 
 Ascensión Gallardo-Antolín 
  Universidad Carlos III de Madrid 
 Juan Manuel Montero Martínez 
  Technical University of Madrid 
Stata has many functions that can be used in decision support systems, forecasting 
systems, and, generally, applications that use analytical or modeling 
functionalities. A web interface with an HTML/JS graphical user interface or an 
XML-based web service are convenient approaches for exposing Stata-based programs 
on public and private computer networks. However, using Stata through a web interface 
or integrating it into a corporate software environment such as a service-oriented 
architecture can be challenging. Usually, Stata-based programs need to be translated 
(reimplemented) in a different programming language to be used through the 
aforementioned interfaces. These reimplementations can be problematic, time consuming, 
and error prone.
We describe an approach for using Stata-based applications directly through a web 
interface, the requirements for such applications, and the limitations of this approach. 
We then discuss modern software engineering solutions for software integration scenarios 
in healthcare informatics and potential use for Stata-based decision support systems 
in this field.
  
   Additional information
   spain15_zlotnik.pdf
 
          Introduction to Markov-switching regression models using the mswitch command
 Gustavo Sánchez 
  StataCorp 
A considerable number of time series can be characterized by data-generating 
processes (DGP) that may be affected by particular events that lead to changes 
in the parameters. The new conditions for the DGP may remain in place for a 
period of time until the change is reversed to the previous state or until a 
new event leads to a new state, with the corresponding change in the parameters. 
In Stata 14, we introduce the mswitch command to model those kinds of time series 
by characterizing the transitions between unobserved states with a Markov chain. 
I will briefly introduce the basic concepts of Markov-switching models, and I 
will use a couple of examples to illustrate the implementation provided by mswitch.
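
The kind of data-generating process described above can be simulated with a short Python sketch. It only generates data and does not estimate anything, so it is not a substitute for mswitch. It assumes two states, a state-specific mean, normally distributed noise, and a fixed probability of staying in each state; all names and parameter values are hypothetical.

```python
import random


def simulate_markov_switching(n, means, stay_prob, sigma=1.0, seed=0):
    """Simulate n observations from a two-state Markov-switching mean model.

    means[s] is the mean in state s; stay_prob[s] is the probability of
    remaining in state s from one period to the next.
    """
    rng = random.Random(seed)
    state = 0
    states, series = [], []
    for _ in range(n):
        states.append(state)
        series.append(rng.gauss(means[state], sigma))
        # Leave the current state with probability 1 - stay_prob[state].
        if rng.random() > stay_prob[state]:
            state = 1 - state
    return states, series


states, series = simulate_markov_switching(
    n=200, means=[0.0, 3.0], stay_prob=[0.95, 0.90]
)
print("periods spent in state 1:", sum(states))
```

In an application, the states are unobserved, and the transition probabilities and state-specific parameters are what the model estimates from the series alone.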
  
   
   Additional information
   spain15_sanchez.pdf
 
   Modeling multilevel data: The estimated dependent variable approach 
 Antonio M. Jaime-Castillo 
  Universidad de Málaga
Multilevel data have become very popular in the social sciences. Several 
international research projects (such as the European Social Survey, the 
International Social Survey Programme, and the World Values Survey) have produced 
a large amount of comparative data in recent decades. The dominant approach to 
analyze multilevel data structures uses multilevel models (a mixture of fixed 
and random effects), and major statistical packages have incorporated routines 
for estimating these kinds of models. This analytical strategy has several 
advantages over most naïve pooling strategies. However, it also has some drawbacks 
on both theoretical and practical grounds. The statistical theory behind multilevel 
models is still under development, and the computational burden to estimate nonlinear 
models, as well as convergence issues, can be challenging in some cases. An 
alternative is the estimated dependent variable (EDV) approach. In the first step, the researcher 
estimates a separate individual-level model within each level 2 unit. In the second step, the 
coefficients estimated in the first step become the dependent variables, to be explained by a 
set of aggregate predictors. In this presentation, I focus 
on the potential applications of this approach using Stata.
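
The two steps can be sketched in Python as follows. This is a minimal illustration rather than the presenter's implementation. It assumes one individual-level predictor, one aggregate predictor, and plain OLS in both steps. It ignores the sampling uncertainty of the first-step estimates, which a real application would need to address. All names and data are hypothetical.

```python
def ols(x, y):
    """Simple bivariate OLS; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def edv(groups, z):
    """Two-step estimated dependent variable sketch.

    groups maps each level 2 unit to (x, y) lists of individual data;
    z maps each unit to its aggregate predictor.
    Returns (intercept, slope) of the second-step regression.
    """
    units = sorted(groups)
    # Step 1: one regression per level 2 unit; keep the estimated slope.
    slopes = [ols(*groups[u])[1] for u in units]
    # Step 2: regress the estimated slopes on the aggregate predictor.
    return ols([z[u] for u in units], slopes)


# Hypothetical noise-free data: within unit u, y = 1 + b_u * x with b_u = 0.5 + 2 * z_u.
x = [0, 1, 2, 3]
z = {"A": 0.0, "B": 1.0, "C": 2.0}
groups = {u: (x, [1 + (0.5 + 2 * zu) * xi for xi in x]) for u, zu in z.items()}
print(edv(groups, z))
```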
  
   Additional information
   spain15_jaime.pdf
 
   A simple procedure to correct for measurement errors in survey research 
 Anna DeCastellarnau 
  Universitat Pompeu Fabra 
Although there is much literature on the existence of measurement errors, few 
researchers correct for them in their analyses. In this presentation, I will 
show that correction for measurement errors in survey research is not only necessary 
but also possible and actually rather simple. Using the quality estimates obtained 
from the free online software Survey Quality Predictor (SQP), one can easily correct 
correlation and covariance matrices and use them as input for the analysis. This procedure 
was described for Stata, LISREL, and R in the ESS EduNet module "A simple procedure to 
correct for measurement errors in survey research". This presentation will focus on the 
correction of measurement errors in regression analysis and causal models using Stata.
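
The idea of correcting a correlation matrix before analysis can be illustrated with the classical correction for attenuation, sketched here in Python. This is not necessarily the exact procedure of the ESS EduNet module. It assumes that each quality estimate can be read as the share of a variable's observed variance that is free of measurement error, and it ignores common method variance. The function name and the numbers are hypothetical.

```python
from math import sqrt


def correct_correlations(corr, quality):
    """Divide each off-diagonal correlation by the square root of the
    product of the two variables' quality estimates."""
    n = len(corr)
    return [
        [
            1.0 if i == j else corr[i][j] / sqrt(quality[i] * quality[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical observed correlation and quality estimates for two survey items.
observed = [[1.0, 0.3], [0.3, 1.0]]
quality = [0.6, 0.75]
print(correct_correlations(observed, quality))
```

The corrected matrix, rather than the raw data, would then serve as input for the regression or causal model.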
  
   Additional information
   spain15_decastellarnau.pdf
 
          Content analysis with Stata 
 Modesto Escobar 
  Universidad de Salamanca 
 José L. Alonso Berrocal 
  Universidad de Salamanca 
Content analysis is a technique used in the social sciences for the systematic study 
of the content of communication. In this presentation, we discuss a couple of useful 
programs for statistical analysis of texts. The first (precoin) splits the 
text into words or groups of words to form an incidence matrix. The second (coin) 
works with this matrix and produces frequencies, co-occurrences, multivariate statistical 
measures of centrality and distance, and various types of graphs. We present, as examples of 
its use, an analysis of a sample of tweets and another analysis of open-ended 
answers from a questionnaire.
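
The two stages described above, building an incidence matrix and then counting co-occurrences, can be sketched in Python. This is not the precoin or coin programs. It is a minimal illustration that assumes whitespace tokenization of lowercased text and single words only; the function names and sample texts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def incidence_matrix(texts):
    """Return (vocabulary, matrix), where matrix[d][w] is 1 if word w
    appears in text d and 0 otherwise."""
    docs = [set(t.lower().split()) for t in texts]
    vocab = sorted(set().union(*docs))
    matrix = [[1 if w in d else 0 for w in vocab] for d in docs]
    return vocab, matrix


def cooccurrences(vocab, matrix):
    """Count, for each pair of words, the number of texts containing both."""
    counts = Counter()
    for row in matrix:
        present = [w for w, flag in zip(vocab, row) if flag]
        counts.update(combinations(present, 2))
    return counts


texts = ["Stata is great", "Stata is fast", "Python is fast"]
vocab, matrix = incidence_matrix(texts)
frequencies = {w: sum(row[k] for row in matrix) for k, w in enumerate(vocab)}
print(frequencies)
print(cooccurrences(vocab, matrix).most_common(3))
```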
  
   Additional information
   spain15_escobar.pdf
 
            Wishes and grumbles
  StataCorp 
  StataCorp staff will be happy to receive wishes for developments in Stata and almost 
  as happy to receive grumbles about the software.  
Scientific organizers
Modesto Escobar, Universidad de Salamanca
Alexander Zlotnik, Polytechnic University of Madrid and Hospital Universitario Ramón y Cajal
Logistics organizers
  Timberlake Consulting S.L.,
  the official distributor of Stata in Spain.