Last updated: 24 May 2010

2010 Mexican Stata Users Group meeting

29 April 2010


Universidad Iberoamericana, Mexico City campus
Prolongación Paseo de la Reforma 880
Lomas de Santa Fe, México C.P. 01219
Distrito Federal, México

Proceedings

Estimation of treatment effects for social program evaluation

Omar Stabridis
Janet Zamudio
Mario Paulín
Consejo Nacional de Evaluación de la Política de Desarrollo Social
Because impact evaluation is an important tool that guides public policy decisions and because applying impact evaluation is a rigorous process, we must generate examples of how impact evaluation methodologies apply to the Mexican context. To this end, we have used nonexperimental methodologies to estimate treatment effects for Mexican social programs.

In order to quantify the effects that a social program has on its beneficiaries’ welfare and productive activities, we have used Stata to estimate treatment effects and to generate an adequate database with information from the Mexican Family Life Survey for the years 2002 and 2005.

Two central Stata commands were used: pscore and psmatch2. The pscore command estimates the propensity score and stratifies individuals according to the propensity-score distribution, using a set of covariates that are assumed to be related to both treatment status and the outcome variable. This command also checks that the balancing property is satisfied. psmatch2 implements a variety of matching methods to estimate the average treatment effect on the treated. Additionally, we used data-management commands such as foreach, merge, collapse, gen, egen, recode, and replace.

Our panel will discuss the advantages and disadvantages of these commands when applied to the evaluation of social programs.
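
As a rough illustration of the matching idea behind this abstract, the following Python sketch computes the average treatment effect on the treated by one-to-one nearest-neighbor matching on a propensity score that is assumed to be already estimated. It is not the pscore or psmatch2 implementation; the function name and the data are hypothetical.

```python
def att_nearest_neighbor(scores, treated, outcomes):
    """ATT from one-to-one nearest-neighbor matching on the propensity
    score, with replacement. Scores are assumed to be estimated already."""
    controls = [(s, y) for s, t, y in zip(scores, treated, outcomes) if not t]
    if not controls:
        raise ValueError("need at least one untreated unit")
    gaps = []
    for s, t, y in zip(scores, treated, outcomes):
        if t:
            # Closest control in propensity-score distance (ties: first found).
            _, y_control = min(controls, key=lambda c: abs(c[0] - s))
            gaps.append(y - y_control)
    if not gaps:
        raise ValueError("need at least one treated unit")
    return sum(gaps) / len(gaps)


# Hypothetical data: two treated units and three controls.
scores = [0.80, 0.60, 0.75, 0.55, 0.20]
treated = [1, 1, 0, 0, 0]
outcomes = [10.0, 8.0, 7.0, 6.0, 3.0]
att = att_nearest_neighbor(scores, treated, outcomes)
print(att)
```

Matching with replacement lets one control serve several treated units, which lowers bias at the cost of higher variance; other matching schemes make a different trade-off.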

Stata as a tool for transparency and statistics dissemination: Measuring multidimensional poverty in Mexico

Víctor H. Pérez
Dulce Cano
Rocío Espinosa
Consejo Nacional de Evaluación de la Política de Desarrollo Social
In 2009, CONEVAL (Consejo Nacional de Evaluación de la Política de Desarrollo Social) presented the official methodology to measure multidimensional poverty in Mexico, which is a set of intuitive indicators that measure income and social rights deprivation, taking into account the territorial context. To follow the principles of technical rigor, transparency, and impartiality, CONEVAL decided to publish all the necessary elements to reproduce its multidimensional poverty measures, including (a) adopted methodology, (b) databases, and (c) Stata and SPSS programs used for generating the indexes. In this presentation, we will show how Stata and SPSS were used to produce the Mexican multidimensional poverty measures as well as the process to “equalize” both programs.

Measuring poverty at state level using Stata

Carlos Guerrero de Lizardi
Manuel Lara Caballero
Instituto Tecnológico de Estudios Superiores de Monterrey
Using the approach proposed in 2002 by the Technical Committee for the Measurement of Poverty (TCMP), CONEVAL produced a set of state-level poverty measures. The methodology consists of comparing the cost of a food basket that contains the minimum consumption requirements with a household’s average income. Data from the National Survey of Household Income and Expenditure (ENIGH) are used in the calculations. Currently, state poverty measures are calculated using the National Consumer Price Index (NCPI) published by BANXICO as the single deflator. Hence, a major pending issue is correcting for regional differences in the cost of living. In this talk, we will describe a set of do-files that implement such a correction and will underline the main methodological and policy implications behind the correction.
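
To make the role of a regional deflator concrete, here is a minimal Python sketch of a weighted poverty headcount ratio computed with and without a regional price correction. It does not reproduce the authors' do-files or the official methodology; the function, the regional price levels, and the household data are all hypothetical.

```python
def headcount_ratio(households, poverty_line, price_index=None):
    """Weighted share of households with income below the poverty line.

    households: iterable of (income, region, weight) tuples.
    price_index: optional dict mapping region to its price level relative
    to the national average (1.0 = national average). Hypothetical input.
    """
    poor = 0.0
    total = 0.0
    for income, region, weight in households:
        deflator = 1.0 if price_index is None else price_index[region]
        real_income = income / deflator  # income at national-average prices
        total += weight
        if real_income < poverty_line:
            poor += weight
    return poor / total


# Hypothetical data: equal nominal incomes, different regional price levels.
households = [(100.0, "north", 1.0), (100.0, "south", 1.0)]
regional_prices = {"north": 1.10, "south": 0.90}
uncorrected = headcount_ratio(households, poverty_line=95.0)
corrected = headcount_ratio(households, poverty_line=95.0,
                            price_index=regional_prices)
print(uncorrected, corrected)
```

With a single national deflator neither household counts as poor, while the regional correction classifies the household in the more expensive region as poor.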

Hierarchical linear models using Stata

Delfino Vargas Chanes
Colegio de México
Maria Merino
Some surveys collect data on individuals who are nested within hierarchical organizations or countries. These data are useful, for instance, for ranking countries according to a major outcome adjusted for covariates. Rankings based only on raw means are biased, so it is necessary to incorporate covariates and acknowledge the hierarchical structure of the data. From the perspective of ordinary regression, such structuring constitutes a statistical problem because it violates the assumption that observations are independent and identically distributed. In such a context, a hierarchical, or multilevel, linear model can be fit so that the hierarchical nature of the data is explicitly modeled. In this presentation, we will briefly discuss the strengths and limitations of hierarchical models for ranking countries.
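
One reason raw means can mislead in rankings is that small groups have noisy means. The Python sketch below shows the shrinkage of group means implied by a random-intercept model, assuming the variance components and the grand mean are already known. It omits covariates and is only an illustration, not the authors' model; all names and numbers are hypothetical.

```python
def shrunken_mean(group_mean, n, grand_mean, tau2, sigma2):
    """Shrinkage prediction of a group mean in a random-intercept model.

    tau2: between-group variance; sigma2: within-group variance.
    Both are assumed known here; in practice they are estimated.
    """
    reliability = tau2 / (tau2 + sigma2 / n)  # near 1 for large groups
    return grand_mean + reliability * (group_mean - grand_mean)


# Hypothetical example: a small group with a high raw mean and a large
# group with a somewhat lower raw mean.
grand_mean, tau2, sigma2 = 5.0, 1.0, 4.0
small = shrunken_mean(10.0, n=2, grand_mean=grand_mean, tau2=tau2, sigma2=sigma2)
large = shrunken_mean(8.0, n=100, grand_mean=grand_mean, tau2=tau2, sigma2=sigma2)
print(small, large)
```

The small group ranks first on raw means but second after shrinkage, because its mean is pulled more strongly toward the grand mean.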

Generating descriptive statistics from the MXFLS

Alicia Santana Cartas
Universidad Iberoamericana
In this presentation, I aim to show how to produce informative descriptive statistics from a longitudinal survey using the Mexican Family Life Survey (MXFLS) as an example. I will introduce the audience to the MXFLS and discuss its main innovative features, such as the sample design, the module on attitudes toward risk, and the migration module (including respondent tracking and the recontact rate). Then I will show how to tabulate the data in an informative way and how to produce descriptive statistics using the provided survey weights.
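
The effect of survey weights on descriptive statistics can be sketched in a few lines of Python. The example below computes a weighted mean and a weighted one-way tabulation; the variable names and values are hypothetical and are not taken from the MXFLS.

```python
def weighted_mean(values, weights):
    """Mean of values where each observation counts weights[i] times."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_shares(categories, weights):
    """One-way tabulation: weighted share of each category."""
    totals = {}
    for c, w in zip(categories, weights):
        totals[c] = totals.get(c, 0.0) + w
    grand_total = sum(totals.values())
    return {c: t / grand_total for c, t in totals.items()}


# Hypothetical respondents with sampling weights.
ages = [20.0, 30.0, 40.0]
areas = ["urban", "rural", "urban"]
weights = [1.0, 1.0, 2.0]
mean_age = weighted_mean(ages, weights)
area_shares = weighted_shares(areas, weights)
print(mean_age, area_shares)
```

The unweighted mean age would be 30, so ignoring the weights would understate the population mean in this made-up sample.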

Keynote lecture: Estimation of count-data panel models

Pravin K. Trivedi
Indiana University
In this talk, I will cover a number of topics related to the estimation of panel models for count data, with empirical illustrations estimated using Stata. For the theoretical background, I will rely on my book with Colin Cameron, Microeconometrics: Methods and Applications (2005, Cambridge University Press). Some of my illustrations will be based on material in my recent book with Colin Cameron, Microeconometrics Using Stata (2009, Stata Press), but several others will be based on as yet unpublished material. This talk will be operational in orientation and, for specificity, I will rely on examples estimated in Stata. I plan to cover the following topics:
  • nonlinear panel-data modeling for exponential mean models
  • fixed- and random-effects panel models for the Poisson and negative binomial regression
  • nonlinear GMM estimation of Poisson panel regression with sample selection or endogenous regressors
  • dynamic panel Poisson regression with correlated random effects
  • dynamic panel Poisson regression with linear feedback
  • finite mixture models for panel Poisson regression
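
As a small illustration of the fixed-effects Poisson topic listed above, the following Python sketch estimates the slope of a single regressor by solving the first-order condition of the conditional (fixed-effects) Poisson likelihood with bisection. It is not taken from the talk or from Stata's implementation. The data are hypothetical and built without noise so that the true slope of 0.5 is recovered.

```python
import math


def fe_poisson_score(b, panels):
    """Score of the conditional fixed-effects Poisson log likelihood for one
    regressor. panels is a list of units, each a list of (x, y) pairs."""
    score = 0.0
    for obs in panels:
        total_y = sum(y for _, y in obs)
        denom = sum(math.exp(b * x) for x, _ in obs)
        # Mean of x under weights proportional to exp(b * x) within the unit.
        x_weighted = sum(x * math.exp(b * x) for x, _ in obs) / denom
        score += sum(x * y for x, y in obs) - total_y * x_weighted
    return score


def fe_poisson_slope(panels, lo=-5.0, hi=5.0, iters=200):
    """Solve score(b) = 0 by bisection; the score is decreasing in b."""
    if fe_poisson_score(lo, panels) < 0 or fe_poisson_score(hi, panels) > 0:
        raise ValueError("root not bracketed by [lo, hi]")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if fe_poisson_score(mid, panels) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical noise-free panel: y = alpha_i * exp(0.5 * x), two units with
# different unit effects alpha_i (1 and 3).
panels = [
    [(x, 1.0 * math.exp(0.5 * x)) for x in (0.0, 0.5, 1.0)],
    [(x, 3.0 * math.exp(0.5 * x)) for x in (0.2, 0.4, 0.9)],
]
b_hat = fe_poisson_slope(panels)
print(b_hat)
```

The unit effects drop out of the first-order condition, which is why the slope is recovered even though the two units have very different levels.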

Bivariate dynamic probit models for panel data

Alfonso Miranda
Institute of Education, University of London
In this talk, I will discuss the main methodological features of the bivariate dynamic probit model for panel data. I will present an example using simulated data, giving special emphasis to the initial conditions problem in dynamic models and the difference between true and spurious state dependence. The model is fit by maximum simulated likelihood.

Selection-bias correction based on the multinomial logit: An application to the Mexican labor market

Luis Huesca
Mario Camberos
Economics Department, Centro de Investigación en Alimentación y Desarrollo
In this presentation, we illustrate an application of a relatively new selection-bias correction methodology based on the multinomial logit model using the selmlog Stata command (Bourguignon, Fournier, and Gurgand, 2007, Journal of Economic Surveys 21: 174–205). selmlog yields consistent and efficient estimates of the selection process and a fairly good correction for the outcome equation, even when the independence of irrelevant alternatives (IIA) assumption does not hold. The exercise depicts the current pattern of occupational choices for individuals in the Mexican labor market using a longitudinal panel with microdata from the Encuesta Nacional de Ocupación y Empleo (ENOE) covering February 2008 to March 2009. We estimate an equation over an endogenously selected population. The command simplifies the handling of both the distributional and the IIA assumptions in parametric models.

Generalized method of moments estimators in Stata

David Drukker
StataCorp LP
Stata 11 has the new command gmm for estimating parameters by generalized method of moments (GMM). gmm can estimate the parameters of linear and nonlinear models for cross-sectional, panel, and time-series data. In this presentation, I provide an introduction to GMM and to the gmm command.
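
For readers new to GMM, a just-identified linear case may help. The Python sketch below estimates y = a + b*x + u with a single instrument z by setting the two sample moment conditions, mean(u) = 0 and mean(z*u) = 0, exactly to zero. It is independent of the gmm command; the function name and the data are hypothetical.

```python
def iv_gmm(y, x, z):
    """Just-identified GMM (instrumental variables) for y = a + b*x + u.

    Sample moment conditions: mean(u) = 0 and mean(z * u) = 0.
    Solving them gives b = cov(z, y) / cov(z, x) and a = ybar - b * xbar.
    """
    n = len(y)
    y_bar, x_bar, z_bar = sum(y) / n, sum(x) / n, sum(z) / n
    cov_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    cov_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    if cov_zx == 0:
        raise ValueError("instrument is uncorrelated with the regressor")
    b = cov_zy / cov_zx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical noise-free data generated from y = 1 + 2 * x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * xi for xi in x]
a_hat, b_hat = iv_gmm(y, x, z)
print(a_hat, b_hat)
```

With more instruments than parameters the moment conditions cannot all be set to zero, and GMM instead minimizes a weighted quadratic form in the sample moments.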

Using Stata to analyze size frequency of the life cycle of a Mexican desert spider

Irma Gisela Nieto-Castañeda
María Luisa Jiménez-Jiménez
Isaías H. Salgado-Ugarte
Centro de Investigaciones Biológicas del Noroeste, S.C. y FES Zaragoza UNAM
In biology, the study of the life cycle of plants and animals helps one to understand the phenology of a particular species, which is useful in pest management or in biological conservation. Spiders are among the most widespread animals on earth. They eat a huge variety of other animals and are good indicators of environmental changes. We studied for the first time the life cycle of an endemic desert spider (Syspira tigrina). Many spider researchers have used the direct estimation of the number of instars to describe the arachnid life cycle. Other methods are based on the analysis of length-frequency distributions over time (indirect methods). Length-frequency distributions are commonly analyzed with histograms. However, histograms depend on the grid origin, are discontinuous, and use a fixed interval width. These problems have motivated the interest of statisticians in alternative, more computationally intensive methods. Kernel density estimators (KDEs) do not depend on the origin position and are continuous distribution estimators. In addition, there are several methods for choosing the interval width (bandwidth). In this study, we use Stata to apply KDEs to length-frequency distributions of spider size, in combination with the traditional approach using histograms.
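
The contrast between histograms and KDEs can be sketched in Python. The example below shows that histogram counts change when only the grid origin moves, and then builds a Gaussian KDE with a normal-reference bandwidth. The measurements are hypothetical, and the bandwidth rule is one common choice, not necessarily the one used in the study.

```python
import math
import statistics
from collections import Counter


def histogram_counts(data, origin, width):
    """Counts per bin, where bin k covers [origin + k*width, origin + (k+1)*width)."""
    return dict(Counter(math.floor((x - origin) / width) for x in data))


def normal_reference_bandwidth(data):
    """Rule-of-thumb bandwidth 1.06 * sd * n^(-1/5) for a Gaussian kernel."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)


def gaussian_kde(data, h):
    """Return a function that evaluates the Gaussian KDE with bandwidth h."""
    norm = 1.0 / (len(data) * h * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Hypothetical body lengths (mm).
lengths = [2.1, 2.4, 2.5, 3.9, 4.2, 4.4, 6.0]
print(histogram_counts(lengths, origin=0.0, width=1.0))
print(histogram_counts(lengths, origin=0.5, width=1.0))

h = normal_reference_bandwidth(lengths)
density = gaussian_kde(lengths, h)
print(h, density(4.0))
```

Shifting the origin by half a bin moves the tallest bar from the smallest sizes to the middle of the range, whereas the KDE has no origin to choose and depends only on the bandwidth.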

ML modeling capabilities: Stata vs Gauss

Armando Sánchez Vargas
Institute for Economic Research, UNAM
The main purpose of this work is to discuss Stata’s capability to implement customized likelihood functions compared with Gauss’s. I compare these two high-level programming languages with built-in function libraries and graphic routines. Overall, Stata’s features seem best suited for analyzing specific models of decision-making processes and other microeconometric applications, while Gauss is ideal for analyzing a broader range of statistical issues based on maximum likelihood estimation. I briefly discuss such modeling capabilities, emphasizing what is still needed and what might be refined.
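
To show what implementing a customized likelihood involves in any language, here is a minimal Python sketch: a user-written log likelihood for an exponential model, maximized numerically by golden-section search and checked against the closed-form estimate. It illustrates neither Stata's ml nor Gauss; the function names and data are hypothetical.

```python
import math


def exp_loglik(log_rate, data):
    """Log likelihood of i.i.d. exponential data, parameterized by log(rate)."""
    return len(data) * log_rate - math.exp(log_rate) * sum(data)


def golden_section_max(f, lo, hi, tol=1e-10):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2


# Hypothetical durations; the closed-form MLE of the rate is n / sum(data).
data = [0.5, 1.5, 2.0, 4.0]
log_rate_hat = golden_section_max(lambda t: exp_loglik(t, data), -10.0, 10.0)
rate_hat = math.exp(log_rate_hat)
print(rate_hat, len(data) / sum(data))
```

Working with log(rate) keeps the rate positive without constraints, a common device when coding likelihoods by hand.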

Analyzing data from complex survey designs

Isabel Cañette
StataCorp LP
This presentation is a tutorial on how to analyze complex survey data in Stata. I will start by reviewing the sampling methods most frequently used for survey data and examining why special treatment is needed to perform estimation with these data. I will discuss the concepts of stratification, clustering, sampling weights, and finite population correction, and illustrate how to account for them by using the svyset command. Once the survey design has been declared with svyset, estimation can be performed by simply adding the svy prefix to a Stata command; I will show some examples.

I will also discuss the variance estimators implemented in Stata for survey data: linearized, jackknife, and balanced repeated replication.

Finally, I will explain how Stata deals with subpopulation estimation and how to use poststratification.
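
Why the design matters can be seen in a small textbook case. The Python sketch below estimates a population mean and its variance under stratified simple random sampling without replacement, including the finite population correction. It ignores clustering and is not how Stata's svy estimators are implemented; the function name and numbers are hypothetical.

```python
import statistics


def stratified_mean(strata):
    """Estimate a population mean under stratified simple random sampling.

    strata: list of (population_size, sample_values) pairs, one per stratum.
    Returns (estimate, variance), with the finite population correction
    applied within each stratum.
    """
    population = sum(size for size, _ in strata)
    estimate = 0.0
    variance = 0.0
    for size, sample in strata:
        n = len(sample)
        share = size / population  # stratum weight W_h
        fpc = 1 - n / size         # finite population correction
        estimate += share * statistics.mean(sample)
        variance += share ** 2 * fpc * statistics.variance(sample) / n
    return estimate, variance


# Hypothetical design: two strata of 100 and 300 units.
strata = [
    (100, [1.0, 2.0, 3.0]),
    (300, [10.0, 14.0]),
]
estimate, variance = stratified_mean(strata)
print(estimate, variance)
```

The unweighted mean of the five sampled values is 6, far from the design-based estimate of 9.5, because the larger stratum is underrepresented in the sample.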

Scientific organizers

Alfonso Miranda (chair), University of London

Landy Sanchez Peña, Colegio de México

Logistics organizers

MultiON Consulting, the official distributor of Stata in Mexico.