
Last updated: 9 June 2009

2009 Mexican Stata Users Group meeting

23 April 2009


Universidad Iberoamericana, Mexico City campus
Prolongación Paseo de la Reforma 880
Lomas de Santa Fe, México C.P. 01219
Distrito Federal, México


Decomposition of the Gini coefficient using Stata

Alejandro López Feldman
Economics Department, Universidad de Guanajuato
The Gini coefficient is widely used to measure inequality in the distribution of income, consumption, and other welfare proxies. Decomposing this measure can help you understand the determinants of inequality. In this presentation, I will use income data from Mexico to illustrate a user-written command, descogini, that implements the Gini decomposition proposed by Lerman and Yitzhaki (1985, Review of Economics and Statistics 67: 151–156). Using this command, the Gini coefficient for total income can be decomposed into three terms: how important the income source is with respect to total income; how equally or unequally distributed the income source is; and how the income source and the distribution of total income are correlated. In the presentation, I will also illustrate how to obtain the impact that a marginal change in a particular income source will have on total income inequality, as well as how to obtain bootstrap standard errors.
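The three terms described above can be illustrated with a minimal Python sketch. It is not the descogini code; it assumes the covariance form of the decomposition, G = sum over sources of S_k * G_k * R_k, and uses simulated data. The source names (wages, transfers) and all function names are hypothetical.

```python
import numpy as np

def cov(a, b):
    """Population covariance of two equal-length arrays."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))

def rank_cdf(x):
    """Empirical cumulative distribution from ranks (assumes no ties)."""
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each observation
    return (ranks + 0.5) / len(x)

def gini(x):
    """Covariance form of the Gini coefficient: 2 cov(x, F(x)) / mean(x)."""
    return 2.0 * cov(x, rank_cdf(x)) / x.mean()

def decompose(sources):
    """sources maps a source name to an income array; returns total Gini and terms."""
    total = sum(sources.values())
    f_total = rank_cdf(total)
    g_total = gini(total)
    terms = {}
    for name, y in sources.items():
        s = y.mean() / total.mean()                # S_k: share of total income
        g = gini(y)                                # G_k: Gini of the source
        r = cov(y, f_total) / cov(y, rank_cdf(y))  # R_k: Gini correlation with total
        share = s * g * r / g_total                # share of total inequality
        # Assumed marginal effect: % change in G from a 1% change in source k
        terms[name] = {"S": s, "G": g, "R": r, "share": share, "marginal": share - s}
    return g_total, terms

# Simulated (made-up) incomes for two hypothetical sources
rng = np.random.default_rng(0)
incomes = {
    "wages": rng.lognormal(mean=0.0, sigma=1.0, size=500),
    "transfers": rng.exponential(scale=0.5, size=500),
}
g_total, terms = decompose(incomes)
for name, t in terms.items():
    print(name, {k: round(v, 3) for k, v in t.items()})
print("Gini of total income:", round(g_total, 3))
```

Because every term is built from covariances with the same ranking of total income, the contributions S_k * G_k * R_k add up to the total Gini, and the marginal effects sum to zero.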

Additional information

Stata in the measurement and analysis of poverty in Mexico

Héctor H. Sandoval
Rodrigo Aranda Balcazar
Martín Lima
Consejo Nacional de Evaluación de la Política de Desarrollo Social
Following the General Law of Social Development, the National Council of Evaluation of Social Development Policy (the acronym for its name in Spanish is CONEVAL) has the responsibility to establish the criteria to define, identify, and measure poverty in Mexico. To carry out this task, CONEVAL primarily uses the information from censuses and surveys carried out by the National Institute of Statistics (INEGI). This type of data usually requires intensive use of statistical software to facilitate its analysis, and Stata is our primary tool for work on poverty. Among the principal products that CONEVAL has presented using Stata as a platform are: 1) an income poverty measure from 1992 to 2006, 2) an estimation of the Social Gap Index 2005, and 3) all the data in the Executive Report of Poverty, Mexico 2007. In addition, the versatility of Stata allowed us to process all the census data to develop Income Poverty Maps 2000–2005. The primary objective of this presentation is to exemplify how we have used Stata to estimate and analyze poverty in CONEVAL’s publications.

Additional information

A review of Stata SVAR modeling capabilities

Armando Sánchez Vargas
Institute for Economic Research, UNAM
In this presentation, I will discuss Stata’s capability to implement the entire SVAR methodology with nonstationary series. In the presence of cointegration, the structuralization of a VAR model takes place at two distinct stages: the first is the identification of the long-run equilibrium relationships, and the second stage is the identification of the short-run interactions. I will briefly discuss such methodology and the available facilities in Stata to carry it out, emphasizing what is still needed and what might be refined.

Additional information

Multilevel modeling of ordinal responses

Sophia Rabe-Hesketh
University of California–Berkeley
Ordered categorical responses can be analyzed with different kinds of logistic regression models, the most popular being the cumulative logit or proportional odds model. Alternatively, ordinal probit models can be specified. When the data have a nested structure, with repeated observations for the same individual (as in longitudinal or panel data), or students nested in schools, these models can be extended by including random effects. I will describe the models and show how they can be estimated using gllamm. I will mention some elaborations of the models such as nonproportional odds and heteroskedastic errors. Finally, I will discuss how to obtain different types of predicted probabilities for these models to assess model fit, to visualize the model graphically, and to make inferences for individual units.
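As a rough illustration of the random-intercept cumulative logit model, the following Python sketch computes conditional category probabilities for one unit. It is not gllamm output; it assumes one common parameterization, P(y > s) = logistic(linear predictor + random intercept - threshold_s), and the thresholds and random-intercept values are made up.

```python
import math

def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(linear_predictor, random_intercept, thresholds):
    """Conditional P(y = s), s = 0..S, given the cluster's random intercept.

    thresholds must be increasing; there are S thresholds for S + 1 categories.
    """
    exceed = [logistic(linear_predictor + random_intercept - k) for k in thresholds]
    upper = [1.0] + exceed  # P(y > s - 1), with P(y > -1) = 1
    lower = exceed + [0.0]  # P(y > s), with P(y > S) = 0
    return [u - l for u, l in zip(upper, lower)]

# Hypothetical values: same covariates, two schools with different random intercepts
thresholds = [-1.0, 0.5, 2.0]
low_school = category_probs(0.3, -1.0, thresholds)
high_school = category_probs(0.3, 1.0, thresholds)
print([round(p, 3) for p in low_school])
print([round(p, 3) for p in high_school])
```

These are probabilities conditional on a given random intercept; population-averaged probabilities would additionally require integrating over the random-intercept distribution.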

Additional information

Dealing with the cryptic survey: Processing labels and value labels with Mata

Alfonso Miranda
Institute of Education, University of London
Survey data often come as a plain table containing cryptic variable names, numbers, and letters. To make sense of the data, the researcher is given a questionnaire or a code book that contains a list of variable names, their description, and an interpretation of the values (either a number or a string) that each variable can take. Code books are commonly provided as plain text or in PDF format. Hence, the researcher is left “free” to type labels and value labels one by one. Doing so is boring and time consuming, and it often leads to bad research habits, such as “cutting” and “processing” only the piece of the survey the researcher needs in the short run and leaving the rest for future processing. Eventually, this leads to the creation of various versions of the same survey, an inability to track important changes, and an inability to reproduce research results, because the researcher cannot recreate the analyzed dataset step by step from the original source. In this talk, I will discuss how to recover the information that is contained in questionnaires or code books and how to process this information in a clean, fast, and efficient way with Mata.
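The general idea of turning a code book into labeling commands can be sketched as follows. The talk uses Mata; this sketch uses Python instead and only generates the text of Stata label commands. The pipe-delimited code book layout, the variable names, and the `_lbl` suffix are all assumptions made for illustration.

```python
def codebook_to_do(lines):
    """Turn code book lines of the assumed form
        name | description | code=text; code=text
    into a list of Stata labeling commands."""
    commands = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, description, values = [part.strip() for part in line.split("|")]
        commands.append(f'label variable {name} "{description}"')
        if values:
            # Each item looks like 1=Male; split only on the first "="
            pairs = [item.split("=", 1) for item in values.split(";") if item.strip()]
            body = " ".join(f'{code.strip()} "{text.strip()}"' for code, text in pairs)
            commands.append(f"label define {name}_lbl {body}")
            commands.append(f"label values {name} {name}_lbl")
    return commands

# Hypothetical code book excerpt
codebook = [
    "sex | Sex of respondent | 1=Male; 2=Female",
    "educ | Highest education level | 1=Primary; 2=Secondary; 3=Tertiary",
    "age | Age in years |",
]
commands = codebook_to_do(codebook)
print("\n".join(commands))
```

Writing the generated commands to a do-file keeps the labeling step scripted, so the labeled dataset can be rebuilt from the original source at any time.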

Additional information

Some improved Stata ado-files for nonparametric smoothing procedures

Isaías H. Salgado Ugarte
FES Zaragoza, UNAM
In this talk, I introduce some improved programs for nonparametric smoothing that originally were written in a very simple manner. These updated ado-files are simple too, but they are more versatile and more “Stata-like” than the original versions. The ado-files include, for density traces, boxdent (boxcar weight function) and dentrace (boxcar and cosine weight functions); for choosing the smoothing parameter in density-frequency estimation, bandw (which permits kernel specification with automatic bandwidth adjustment); for direct and discretized variable bandwidth density estimation, varwiker and varwike2, respectively; for finding critical bandwidth for a specified number of modes, critiband; and for nonparametric assessment of multimodality, bootsamb (to use in conjunction with the boot command). In spite of its simplicity, this collection of commands has proved to be very useful in the analysis of biological (and other kinds of) data, saving the analyst considerable amounts of time and effort.
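For readers unfamiliar with density traces, here is a minimal Python sketch of a boxcar-weighted trace. It is not the boxdent or dentrace code; the function name, the simulated bimodal sample, and the halfwidth value are assumptions for illustration.

```python
import numpy as np

def boxcar_density(data, grid, halfwidth):
    """Density trace with a boxcar weight function: the share of observations
    within halfwidth of each grid point, divided by the window width."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    inside = np.abs(grid[:, None] - data[None, :]) <= halfwidth
    return inside.sum(axis=1) / (len(data) * 2.0 * halfwidth)

# Simulated bimodal sample (made up), e.g. body lengths from two cohorts
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(10.0, 1.0, 150), rng.normal(15.0, 1.2, 100)])

halfwidth = 0.5
grid = np.linspace(sample.min() - 1.0, sample.max() + 1.0, 2001)
density = boxcar_density(sample, grid, halfwidth)

# Riemann-sum check that the trace integrates to roughly one
step = grid[1] - grid[0]
area = density.sum() * step
print("approximate area under the trace:", round(area, 3))
```

A larger halfwidth gives a smoother trace with fewer modes, which is the trade-off that bandwidth-selection and critical-bandwidth procedures address.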

Additional information

Cointegrating VAR models and probability forecasting in Stata

Gustavo Sánchez Bizot
Senior Statistician, StataCorp
I discuss two applications of the vec commands in this presentation. First, I use the cointegrating VAR approach discussed in Garratt et al. (2006, Global and National Macroeconometric Modelling: A Long-run Structural Approach) to fit a vector error-correction model. In contrast with the application of the traditional Johansen statistical restrictions for the identification of the coefficients of the cointegrating vectors, I use Stata to show an alternative specification of those restrictions based on the approach by Garratt et al. Second, I apply probability forecasting to simulate probability distributions for the forecasted periods. This approach produces probabilities for future single and joint events, instead of only producing point forecasts and confidence intervals. For example, we could estimate the joint probability of two-digit inflation combined with a decrease in the GDP.
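The probability forecasting idea can be sketched in Python as follows. This is not the procedure from the talk, which simulates from a fitted vector error-correction model; here the forecast distribution is simply assumed to be bivariate normal, and the point forecasts and error covariance are made-up numbers.

```python
import numpy as np

rng = np.random.default_rng(12345)

# Hypothetical one-step-ahead forecast for (inflation, GDP growth), in percent.
# Both the point forecasts and the forecast-error covariance are invented.
point = np.array([8.0, 0.5])
error_cov = np.array([[9.0, -1.5],
                      [-1.5, 4.0]])

# Simulate the forecast distribution
draws = rng.multivariate_normal(point, error_cov, size=100_000)
inflation, gdp_growth = draws[:, 0], draws[:, 1]

# Probabilities of single events and of the joint event
p_inflation = float(np.mean(inflation >= 10.0))   # two-digit inflation
p_gdp_fall = float(np.mean(gdp_growth < 0.0))     # decrease in GDP
p_joint = float(np.mean((inflation >= 10.0) & (gdp_growth < 0.0)))

print("P(two-digit inflation):", round(p_inflation, 3))
print("P(GDP decreases):", round(p_gdp_fall, 3))
print("P(both):", round(p_joint, 3))
```

The joint probability comes directly from the share of simulated draws in which both events occur, which is information that separate confidence intervals for each series cannot provide.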

Additional information

Determinants and consequences of property tax collection in Mexico

Daniel Broid
Secretaría de Hacienda y Crédito Público (SHCP)
In this presentation, I will investigate the determinants of property tax collection in Mexico. The tax is paid by all owners of land and dwellings in Mexico for the right of holding their properties, and is collected and managed by municipal authorities at the local level. This type of tax has attractive economic features such as efficiency, progressiveness, and good capacity to finance local public goods. However, the amount of public funds raised through the collection of this tax is extremely low. This presentation will describe the main results of the study and show how Stata was used to perform the analysis.

Additional information

Predicting counterfactual densities with the DFL ado-file: A pertinent constructive critique

Luis Huesca
Economics Department, CIAD
It seems that the user-written dfl command has a problem when using micro-unit data without weighting, because its estimates of densities integrate to more than one. This situation produces densities that need to be corrected before a proper empirical analysis can be carried out. In this presentation, I will suggest a way of rescaling the outcome variables by applying weights to densities before the kernels are estimated using the Jenkins and Van Kerm (2005, Journal of Economic Inequality 3: 43–61) technique. I present an example of earnings in the Mexican labor market by subgroup population shares, and show that the probability density function decomposition approach is more accurate once the estimated densities no longer integrate to more than one.
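The general point about weights and densities can be illustrated with a short Python sketch. It does not reproduce the dfl command or the Jenkins and Van Kerm technique; it only shows, under made-up data and weights, that a weighted kernel density integrates to the sum of its weights, so weights must be rescaled to sum to one.

```python
import numpy as np

def weighted_kde(data, weights, grid, bandwidth):
    """Weighted Gaussian kernel density: sum_i w_i * K((x - x_i) / h) / h."""
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return (kernel * weights[None, :]).sum(axis=1) / bandwidth

# Simulated (made-up) log earnings and reweighting factors
rng = np.random.default_rng(7)
n = 300
log_earnings = rng.normal(10.0, 1.0, n)
factors = rng.uniform(0.5, 2.5, n)  # hypothetical counterfactual weights

bandwidth = 0.3
grid = np.linspace(log_earnings.min() - 5 * bandwidth,
                   log_earnings.max() + 5 * bandwidth, 2001)
step = grid[1] - grid[0]

# Dividing by n alone leaves weights that sum to the mean factor, not to one
raw = weighted_kde(log_earnings, factors / n, grid, bandwidth)
# Rescaling so the weights sum to one fixes the area
rescaled = weighted_kde(log_earnings, factors / factors.sum(), grid, bandwidth)

area_raw = float(raw.sum() * step)
area_rescaled = float(rescaled.sum() * step)
print("area with raw weights:", round(area_raw, 3))
print("area with rescaled weights:", round(area_rescaled, 3))
```

Only the rescaled estimate is a proper density, which matters when counterfactual and actual densities are compared point by point.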

Additional information

Analysis of micro data from ENIGH using Stata

Juan Francisco Islas Aguirre
Economics Division, CIDE
Surveys such as the Encuesta Nacional de Ingresos y Gastos de los Hogares (ENIGH, or the National Survey of Household Income and Expenditure) offer many opportunities for the design, estimation, and testing of applied models in social science. The ENIGH is also a valuable source of case studies that can be used as real-life examples for teaching and learning. In this presentation, I discuss a series of exercises from the ENIGH that are used for teaching statistics and econometrics with Stata. I emphasize how to use the facilities of Stata as a learning tool.

Additional information

Reproducible research: Weaving with Stata and StatWeave

Bill Rising
Director of Educational Services, StataCorp
Reproducible research is one of many names for the same concept: writing one report document that contains both the report and the commands from a statistical or programming language needed to produce the results and graphics contained in the report. It is called reproducible research because any interested researcher can then reproduce another’s entire report verbatim. (Programmers call this same concept literate programming.) The utility of reproducible research documents extends far beyond research or programming. They allow rapid updates should there be additional data. They can also be used in teaching for generating differing examples or test questions, because different parameters will generate different examples. In this presentation, I will show you how to use a third-party application to embed Stata code, as well as its output, in either LaTeX or OpenOffice documents. I will also use example documents (including the talk itself) to show how you can update a report, its results, and its graphics by using new data or changing parameters.

Additional information

Scientific organizers

Alfonso Miranda, University of London

Isaías H. Salgado Ugarte, UNAM

Logistics organizers

MultiON Consulting, the official distributor of Stata in Mexico.