Alejandro López Feldman

Economics Department, Universidad de Guanajuato

The Gini coefficient is widely used to measure inequality in the
distribution of income, consumption, and other welfare proxies. Decomposing
this measure can help you understand the determinants of inequality. In this
presentation, I will use income data from Mexico to illustrate a
user-written command, **descogini**, that implements the Gini
decomposition proposed by Lerman and Yitzhaki (1985, *Review of Economics
and Statistics* 67: 151–156). Using this command, the Gini
coefficient for total income can be decomposed into three terms: how important
the income source is with respect to total income; how equally or unequally
distributed the income source is; and how the income source and the
distribution of total income are correlated. In the presentation, I will
also illustrate how to obtain the impact that a marginal change in a
particular income source will have on total income inequality, as well as
how to obtain bootstrap standard errors.
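
The three terms can be illustrated with a minimal Python sketch. It is not the **descogini** code; it assumes the standard covariance form of the Lerman and Yitzhaki decomposition, in which the total Gini equals the sum over sources of S (share of total income) times G (Gini of the source) times R (Gini correlation of the source with the total income ranking), and the effect of a marginal percentage change in a source is S·G·R/Gini − S. The function names and the four-household data are hypothetical, and every source is assumed to have a positive mean and to vary across households.

```python
def mid_ranks(values):
    """Fractional ranks in (0, 1]; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank / n
        i = j + 1
    return ranks


def cov(x, y):
    """Population covariance of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def gini_by_source(sources):
    """Return the total Gini and, for each source, S, G, R and derived terms."""
    names = list(sources)
    n = len(sources[names[0]])
    total = [sum(sources[k][i] for k in names) for i in range(n)]
    mean_total = sum(total) / n
    rank_total = mid_ranks(total)
    gini_total = 2 * cov(total, rank_total) / mean_total
    parts = {}
    for k in names:
        y = sources[k]
        mean_k = sum(y) / n
        cov_own = cov(y, mid_ranks(y))
        share = mean_k / mean_total            # S: importance of the source
        gini_k = 2 * cov_own / mean_k          # G: inequality of the source
        corr = cov(y, rank_total) / cov_own    # R: Gini correlation with total
        contribution = share * gini_k * corr
        parts[k] = {
            "S": share,
            "G": gini_k,
            "R": corr,
            "contribution": contribution,
            "marginal": contribution / gini_total - share,
        }
    return gini_total, parts


# Hypothetical data: four households, two income sources.
incomes = {"wages": [10, 20, 30, 40], "transfers": [8, 6, 4, 2]}
gini_total, parts = gini_by_source(incomes)
for name, p in parts.items():
    print(name, {key: round(value, 4) for key, value in p.items()})
print("total Gini:", round(gini_total, 4))
```

In this made-up example, transfers go mostly to households at the bottom of the total income ranking, so their Gini correlation is negative and a marginal increase in transfers lowers total inequality.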

**Additional information**

mex09sug_alf.pdf

Héctor H. Sandoval

Rodrigo Aranda Balcazar

Martín Lima

Consejo Nacional de Evaluación de la Política de Desarrollo Social

Following the General Law of Social Development, the National Council of
Evaluation of Social Development Policy (the acronym for its name in Spanish
is CONEVAL) has the responsibility to establish the criteria to define,
identify, and measure poverty in Mexico. To fulfill this mandate, CONEVAL
primarily uses information from censuses and surveys carried out by the
National Institute of Statistics (INEGI). Analyzing this type of data usually
requires intensive use of statistical software, and Stata is our principal
tool for work on poverty. Among the
principal products that CONEVAL has produced using Stata are
1) an income poverty measure from 1992 to 2006, 2) an estimate of the
Social Gap Index 2005, and 3) all the data in the Executive Report of
Poverty, Mexico 2007. In addition, the versatility of Stata allowed us to
process all the census data to develop Income Poverty Maps 2000–2005.
The primary objective of this presentation is to exemplify how we have used
Stata to estimate and analyze poverty in CONEVAL’s publications.

**Additional information**

mex09sug_hs.pptx

Armando Sánchez Vargas

Institute for Economic Research, UNAM

In this presentation, I will discuss Stata’s capability to implement
the entire SVAR methodology with nonstationary series. In the presence of
cointegration, the structuralization of a VAR model takes place at two
distinct stages: the first is the identification of the long-run equilibrium
relationships, and the second stage is the identification of the short-run
interactions. I will briefly discuss this methodology and the available
facilities in Stata to carry it out, emphasizing what is still needed and
what might be refined.

**Additional information**

mex09sug_asv.ppt

Sophia Rabe-Hesketh

University of California–Berkeley

Ordered categorical responses can be analyzed with different kinds of
logistic regression models, the most popular being the cumulative logit or
proportional odds model. Alternatively, ordinal probit models can be
specified. When the data have a nested structure, with repeated
observations for the same individual (as in longitudinal or panel data), or
students nested in schools, these models can be extended by including random
effects. I will describe the models and show how they can be estimated using
**gllamm**. I will mention some elaborations of the models such as
nonproportional odds and heteroskedastic errors. Finally, I will discuss how
to obtain different types of predicted probabilities for these models to
assess model fit, to visualize the model graphically, and to make inferences
for individual units.
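
As an illustration of the different kinds of predicted probabilities, here is a minimal Python sketch. It is not **gllamm** output. It assumes a cumulative logit model with a single normally distributed random intercept u, so that P(y ≤ s | u) = logistic(κ_s − xb − u), and it contrasts probabilities conditional on u = 0 with marginal (population-averaged) probabilities obtained by averaging over u on a simple grid. The function names, thresholds, and random-intercept standard deviation are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(eta, thresholds):
    """P(y = s | eta) for each category, given increasing thresholds."""
    cumulative = [logistic(k - eta) for k in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def marginal_probs(xb, thresholds, sd_u, points=2001, width=8.0):
    """Average conditional probabilities over u ~ N(0, sd_u^2) on a grid."""
    step = 2 * width * sd_u / (points - 1)
    grid = [-width * sd_u + i * step for i in range(points)]
    weights = [math.exp(-0.5 * (u / sd_u) ** 2) for u in grid]
    total_weight = sum(weights)
    out = [0.0] * (len(thresholds) + 1)
    for u, w in zip(grid, weights):
        for s, p in enumerate(category_probs(xb + u, thresholds)):
            out[s] += w * p / total_weight
    return out


# Hypothetical three-category model: thresholds at -1 and 1, xb = 0.
thresholds = [-1.0, 1.0]
conditional = category_probs(0.0, thresholds)            # at u = 0
marginal = marginal_probs(0.0, thresholds, sd_u=1.5)     # averaged over u
print("conditional on u = 0:", [round(p, 4) for p in conditional])
print("marginal:            ", [round(p, 4) for p in marginal])
```

Averaging over the random intercept moves probability mass from the middle category to the extremes, which is why the two kinds of prediction answer different questions: one describes a typical unit, the other the population.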

**Additional information**

mex09sug_srh.zip

Alfonso Miranda

Institute of Education, University of London

Survey data often come as a plain table containing cryptic variable names,
numbers, and letters. To make sense of the data, the researcher is given a
questionnaire or a code book that contains a list of variable names, their
description, and an interpretation of the values (either a number or a
string) that each variable can take. Code books are commonly provided as
plain text or in PDF format. Hence, the researcher is left
“free” to type labels and value labels one by one. This often
leads to bad research habits, such as “cutting” and
“processing” the piece of survey the researcher needs in the
short run and leaving the rest for future processing. Obviously, this is
boring, time-consuming, and eventually leads to the creation of various
versions of the same survey, an inability to track important changes, and an
incapacity to reproduce research results—because the researcher cannot
recreate the analyzed dataset step by step from the original source. In this
talk, I will discuss how to recover the information that is contained in
questionnaires or code books and how to process this information in a clean,
fast, and efficient way with Mata.
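
The idea of generating labels from the code book, rather than typing them, can be sketched as follows. The talk uses Mata; this is only a Python illustration that writes ordinary Stata `label variable`, `label define`, and `label values` commands. The code book layout (one variable per line, fields separated by `|`, value codes written as `code=label` and separated by `;`) and the variable names are hypothetical.

```python
# Hypothetical plain-text code book: name | description | value codes.
CODEBOOK = """\
sexo | Sex of respondent | 1=Male; 2=Female
edad | Age in years |
"""


def codebook_to_do(text):
    """Turn code book lines into Stata labeling commands (one per list item)."""
    commands = []
    for raw in text.splitlines():
        if not raw.strip():
            continue
        name, description, values = [part.strip() for part in raw.split("|")]
        commands.append(f'label variable {name} "{description}"')
        if values:
            pairs = []
            for item in values.split(";"):
                code, label = item.split("=", 1)
                pairs.append(f'{code.strip()} "{label.strip()}"')
            commands.append(f"label define {name}_lbl " + " ".join(pairs))
            commands.append(f"label values {name} {name}_lbl")
    return commands


do_file_lines = codebook_to_do(CODEBOOK)
print("\n".join(do_file_lines))
```

Because the commands are regenerated from the original code book every time, the labeled dataset can be rebuilt step by step from the source files.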

**Additional information**

mex09sug_am.pdf

Isaías H. Salgado Ugarte

FES Zaragoza, UNAM

In this talk, I introduce some improved programs for nonparametric smoothing
that originally were written in a very simple manner. These updated
ado-files are simple too, but they are more versatile and more
“Stata-like” than the original versions. The ado-files include,
for density traces, **boxdent** (boxcar weight function) and
**dentrace** (boxcar and cosine weight functions); for choosing the
smoothing parameter in density-frequency estimation, **bandw** (which
permits kernel specification with automatic bandwidth adjustment); for
direct and discretized variable bandwidth density estimation,
**varwiker** and **varwike2**, respectively; for finding critical
bandwidth for a specified number of modes, **critiband**; and for
nonparametric assessment of multimodality, **bootsamb** (to use in
conjunction with the **boot** command). In spite of its simplicity, this
collection of commands has proved to be very useful in the analysis of
biological (and other kinds of) data, saving the analyst considerable amounts
of time and effort.
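
For readers unfamiliar with density traces, here is a minimal Python sketch of a boxcar-weighted estimate. It is not the code of **boxdent** or **dentrace**; it assumes the convention that h is the full window width, so the estimate at x is the number of observations within h/2 of x divided by n·h. The function name and the data are hypothetical.

```python
def boxcar_density(data, x, h):
    """Density trace at x: share of points within h/2 of x, divided by h."""
    n = len(data)
    inside = sum(1 for xi in data if abs((x - xi) / h) < 0.5)
    return inside / (n * h)


# Hypothetical sample and evaluation grid.
sample = [1.0, 2.0, 3.0]
grid = [0.5 * i for i in range(-2, 11)]  # -1.0, -0.5, ..., 5.0
trace = [(x, boxcar_density(sample, x, h=2.0)) for x in grid]
for x, density in trace:
    print(f"{x:5.1f}  {density:.4f}")
```

A wider h gives a smoother trace with fewer modes, which is the trade-off that the bandwidth-selection and critical-bandwidth commands address.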

**Additional information**

mex09sug_isu.pptx

Gustavo Sánchez Bizot

Senior Statistician, StataCorp

I discuss two applications of the **vec** commands in this presentation.
First, I use the cointegrating VAR approach discussed in Garratt et al.
(2006, *Global and National Macroeconometric Modelling: A Long-run
Structural Approach*) to fit a vector error-correction model. In contrast
with the application of the traditional Johansen statistical restrictions
for the identification of the coefficients of the cointegrating vectors, I
use Stata to show an alternative specification of those restrictions based
on the approach by Garratt et al. Second, I apply probability forecasting to
simulate probability distributions for the forecasted periods. This approach
produces probabilities for future single and joint events, instead of only
producing point forecasts and confidence intervals. For example, we could
estimate the joint probability of two-digit inflation combined with a
decrease in the GDP.
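
The logic of probability forecasting can be sketched in a few lines of Python. This is not the presentation's Stata code: in practice the simulated paths would come from the fitted model, whereas here, purely as an assumption, one-period forecasts of inflation and GDP growth are drawn from a bivariate normal distribution around hypothetical point forecasts. All names and numbers are made up for illustration.

```python
import math
import random


def event_probabilities(inflation_forecast, growth_forecast,
                        sd_inflation, sd_growth, rho,
                        draws=20000, seed=12345):
    """Estimate single and joint event probabilities as shares of draws."""
    rng = random.Random(seed)
    high_inflation = gdp_fall = both = 0
    for _ in range(draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Correlated forecast errors built from two independent normals.
        inflation = inflation_forecast + sd_inflation * z1
        growth = growth_forecast + sd_growth * (
            rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        two_digit = inflation >= 10.0   # two-digit inflation
        decrease = growth < 0.0         # decrease in GDP
        high_inflation += two_digit
        gdp_fall += decrease
        both += two_digit and decrease
    return {
        "two_digit_inflation": high_inflation / draws,
        "gdp_decrease": gdp_fall / draws,
        "joint": both / draws,
    }


# Hypothetical point forecasts (percent) and forecast-error parameters.
probs = event_probabilities(inflation_forecast=8.0, growth_forecast=1.0,
                            sd_inflation=2.0, sd_growth=1.5, rho=-0.4)
print(probs)
```

The joint probability is simply the share of simulated futures in which both events occur, something a point forecast with separate confidence intervals cannot provide.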

**Additional information**

mex09sug_gs.ppt

Daniel Broid

Secretaría de Hacienda y Crédito Público (SHCP)

In this presentation, I will investigate the determinants of property tax
collection in Mexico. The tax is paid by all owners of land and dwellings in
Mexico for the right of holding their properties, and is collected and
managed by municipal authorities at the local level. This type of tax has
attractive economic features such as efficiency, progressiveness, and good
capacity to finance local public goods. However, the amount of public funds
raised through the collection of this tax is extremely low. This
presentation will describe the main results of the study and show how Stata was
used to perform the analysis.

**Additional information**

mex09sug_db.pdf

Luis Huesca

Economics Department, CIAD

It seems that the user-written **dfl** command has a problem when using
micro-unit data without weighting, because its estimates of densities
integrate to more than one. This situation produces densities that need to
be corrected before a proper empirical analysis can be carried out. In this
presentation, I will suggest a way of rescaling the outcome variables by
applying weights to densities before the kernels are estimated using the
Jenkins and Van Kerm (2005, *Journal of Economic Inequality* 3:
43–61) technique. I present an example of earnings in the Mexican
labor market by subgroup population shares, and show that the probability
density function decomposition approach is more accurate once the estimated
densities no longer integrate to more than one.

**Additional information**

mex09sug_lh.pptx

Juan Francisco Islas Aguirre

Economics Division, CIDE

Surveys such as the Encuesta Nacional de Ingresos y Gastos de los Hogares
(ENIGH, or the National Survey of Household Income and Expenditure) offer
many opportunities for the design, estimation, and testing of applied models
in social science. The ENIGH is also a valuable source of case studies that
can be used as real-life examples for teaching and learning. In this
presentation, I discuss a series of exercises from the ENIGH that are used
for teaching statistics and econometrics with Stata. I emphasize how to use
the facilities of Stata as a learning tool.

**Additional information**

mex09sug_jfi.pdf

Bill Rising

Director of Educational Services, StataCorp

Reproducible research is one of many names for the same concept: writing
one report document that contains both the report and the commands from a
statistical or programming language needed to produce the results and
graphics contained in the report. It is called reproducible research because
any interested researcher can then reproduce another’s entire report
verbatim. (Programmers call this same concept literate programming.) The
utility of reproducible research documents extends far beyond research or
programming. They allow rapid updates should there be additional data. They
can also be used in teaching for generating differing examples or test
questions, because different parameters will generate different examples. In
this presentation, I will show you how to use a third-party application to
embed Stata code, as well as its output, in either LaTeX or OpenOffice
documents. I will also use example documents (including the talk itself) to
show how you can update a report, its results, and its graphics by using new
data or changing parameters.

**Additional information**

mex09sug_br.zip
