2010 German Stata Users Group meeting: Abstracts

Biometrical modeling of twin and family data in Stata

Sophia Rabe-Hesketh
University of California–Berkeley
Data on twins or on other types of family structures (for example, nuclear families, siblings, cousins) can be used to estimate the proportion of variability in observed traits (or phenotypes) that is due to genes. The models are essentially multivariate regression models with residual covariance structures dictated by Mendelian genetics. Usually, specialized software for structural equation modeling is used. However, the required covariance structures can also be produced using mixed models and by specifying an appropriate design matrix for the random part of the model. Stata’s xtmixed command can then be used to estimate the models. For binary phenotypes, such as diabetes, the appropriate probit models can be estimated using gllamm.
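As a rough illustration of the covariance structures mentioned above, the sketch below assumes the standard ACE decomposition (additive genetic, common environment, and unique environment variances), which the abstract does not name explicitly. The function names and the variance values are hypothetical.

```python
def ace_pair_covariance(var_a, var_c, var_e, zygosity):
    """Implied 2x2 phenotype covariance matrix for a twin pair under an ACE model."""
    # Assumption: MZ twins share all additive genetic effects, DZ twins half on average.
    genetic_share = {"MZ": 1.0, "DZ": 0.5}[zygosity]
    total = var_a + var_c + var_e
    between = genetic_share * var_a + var_c
    return [[total, between], [between, total]]


def heritability(var_a, var_c, var_e):
    """Proportion of phenotypic variance attributed to additive genetic effects."""
    return var_a / (var_a + var_c + var_e)


# Hypothetical variance components, chosen only for illustration.
mz = ace_pair_covariance(0.5, 0.3, 0.2, "MZ")
dz = ace_pair_covariance(0.5, 0.3, 0.2, "DZ")
h2 = heritability(0.5, 0.3, 0.2)
```

The larger MZ covariance relative to the DZ covariance is what identifies the genetic variance; a mixed model reproduces the same matrices through a suitable random-effects design.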

An introduction to matching methods for causal inference and their implementation in Stata

Barbara Sianesi
Institute for Fiscal Studies
Matching, especially in its propensity-score flavors, has become an extremely popular evaluation method. Matching is, in fact, the best-available method for selecting a matched (or reweighted) comparison group that looks like the treatment group of interest.

In this talk, I will introduce matching methods within the general problem of causal inference, highlight their strengths and weaknesses, and offer a brief overview of different matching estimators. Using psmatch2, I will then step through a practical example in Stata based on real data, show how to implement some of these estimators, and highlight a number of implementation issues.
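To make the idea of a matched comparison group concrete, here is a minimal sketch of one-to-one nearest-neighbor matching on a propensity score, with replacement. It is not the psmatch2 implementation; the function name and the toy data are hypothetical, and the scores are taken as given rather than estimated.

```python
def att_nearest_neighbor(pscore, treated, outcome):
    """Average effect on the treated from 1:1 nearest-neighbor matching with replacement."""
    controls = [i for i, d in enumerate(treated) if d == 0]
    if not controls:
        raise ValueError("need at least one control unit")
    gaps = []
    for i, d in enumerate(treated):
        if d != 1:
            continue
        # Closest control in propensity score; ties go to the first control found.
        j = min(controls, key=lambda c: abs(pscore[c] - pscore[i]))
        gaps.append(outcome[i] - outcome[j])
    if not gaps:
        raise ValueError("need at least one treated unit")
    return sum(gaps) / len(gaps)


# Hypothetical toy data; in practice the scores come from a logit or probit model.
pscore = [0.8, 0.6, 0.75, 0.55, 0.2]
treated = [1, 1, 0, 0, 0]
outcome = [10.0, 8.0, 7.0, 6.0, 1.0]
att = att_nearest_neighbor(pscore, treated, outcome)
```

In this toy sample the control with score 0.2 is never used as a match, which is the sense in which matching selects a comparison group that resembles the treated units.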

Heterogeneous treatment-effect analysis

Ben Jann
ETH Zürich
Methods for causal inference and the estimation of treatment effects have received much attention in recent years. Most of the methodological and applied work focuses on the identification of so-called average treatment effects, possibly restricted to the treated or the untreated. However, treatment effects may vary (hence the averaging), and it can be interesting to analyze the patterns of effect heterogeneity. In this talk, I will present a new command called hte that is used for heterogeneous treatment-effect analysis in Stata. hte first constructs balanced propensity-score strata and, within each stratum, estimates the average treatment effect. hte then tests for a linear trend in effects across the strata. The stratum-specific treatment effects and the estimated linear trend are displayed in a two-way graph. hte results from joint work with Jennie E. Brand (UCLA) and Yu Xie (University of Michigan).
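The three steps described above (stratify, estimate within strata, test for a trend) can be sketched as follows. This is a simplified illustration and not the hte code: the stratum cutpoints are supplied by hand instead of being found by a balancing algorithm, the trend is an unweighted least-squares slope, and all names and data are hypothetical.

```python
def stratum_effects(pscore, treated, outcome, cutpoints):
    """Treated-minus-control difference in mean outcomes within each propensity-score stratum."""
    edges = [0.0] + list(cutpoints) + [1.0]
    effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Strata are left-closed; the last one also includes a score of exactly 1.0.
        idx = [i for i, p in enumerate(pscore) if lo <= p < hi or (hi == 1.0 and p == 1.0)]
        y1 = [outcome[i] for i in idx if treated[i] == 1]
        y0 = [outcome[i] for i in idx if treated[i] == 0]
        if not y1 or not y0:
            raise ValueError("each stratum needs treated and control units")
        effects.append(sum(y1) / len(y1) - sum(y0) / len(y0))
    return effects


def linear_trend(effects):
    """Unweighted least-squares slope of the stratum effects on stratum rank 1, 2, ..."""
    if len(effects) < 2:
        raise ValueError("need at least two strata to estimate a trend")
    ranks = list(range(1, len(effects) + 1))
    mean_r = sum(ranks) / len(ranks)
    mean_e = sum(effects) / len(effects)
    num = sum((r - mean_r) * (e - mean_e) for r, e in zip(ranks, effects))
    den = sum((r - mean_r) ** 2 for r in ranks)
    return num / den


# Hypothetical toy data with three strata and effects that grow with the score.
pscore = [0.1, 0.2, 0.4, 0.5, 0.8, 0.9]
treated = [1, 0, 1, 0, 1, 0]
outcome = [3.0, 2.0, 5.0, 3.0, 8.0, 5.0]
effects = stratum_effects(pscore, treated, outcome, cutpoints=[0.33, 0.66])
trend = linear_trend(effects)
```

A positive slope would indicate that units more likely to be treated also benefit more; a real analysis would weight the strata by the precision of their estimates.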

Estimation of linear fixed-effects models with individual-specific slopes in Stata

Volker Ludwig
Mannheim Center for European Social Research (MZES)
Fixed-effects regression is considered a powerful method for estimating causal effects with survey data. However, in the linear model, the conventional technique of time-demeaning does not yield consistent estimates of the parameters when unobserved heterogeneity is not time-constant. Jeffrey M. Wooldridge (2002, Econometric Analysis of Cross Section and Panel Data [MIT Press], 317–322) derived a general model for the situation where unobserved and observed characteristics of individuals interact to produce the outcome. The fixed-effects model with individual constants and slopes (FEIS) is a remedy for coefficients that are biased due to, for example, maturation or learning where unobserved traits affect individual growth curves differently for treated and controls.

The Stata xtfeis command implements the FEIS estimator in Mata, allowing for individual constants and (potentially many) slopes. Without specifying slope variables, the model collapses to the conventional model estimated by xtreg, fe that accounts for individual constants only. xtfeis implements standard errors that are robust to serial correlation or heteroskedasticity of unknown form. Estimates of the slope parameters are available optionally. The command requires panel data with at least J + 1 observations per unit, where J is the number of individual-specific slope variables (usually, but not necessarily, also including the individual-specific constant). I will present results for the effect of marriage on male wages based on real data (GSOEP and NLSY) to demonstrate the practical relevance of the method. I will use simulation results to assess robustness of the estimator to autocorrelation, measurement error, and misspecification of functional form.
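The estimator can be illustrated with a small sketch, assuming one regressor and one individual-specific slope variable (time) plus the individual constant. Each unit's outcome and regressor are first detrended by a unit-specific regression on a constant and time; pooled least squares on the residuals then gives the coefficient. This is not the xtfeis code, it omits standard errors, and all names and data are hypothetical.

```python
def residualize(values, t):
    """Residuals from regressing one unit's values on a constant and t."""
    n = len(values)
    mean_t = sum(t) / n
    mean_v = sum(values) / n
    stt = sum((ti - mean_t) ** 2 for ti in t)
    if stt == 0:
        raise ValueError("slope variable must vary within the unit")
    slope = sum((ti - mean_t) * (vi - mean_v) for ti, vi in zip(t, values)) / stt
    return [vi - mean_v - slope * (ti - mean_t) for ti, vi in zip(t, values)]


def feis_slope(panel):
    """Coefficient of y on x after removing unit-specific constants and time trends.

    panel maps a unit id to a (t, x, y) tuple of equal-length lists.
    """
    sxy = sxx = 0.0
    for t, x, y in panel.values():
        if len(t) < 3:
            # J = 2 here (constant plus one slope), so J + 1 = 3 rows are required.
            raise ValueError("need at least 3 observations per unit")
        x_res = residualize(x, t)
        y_res = residualize(y, t)
        sxy += sum(a * b for a, b in zip(x_res, y_res))
        sxx += sum(a * a for a in x_res)
    return sxy / sxx


# Hypothetical noise-free data: y = a_i + b_i * t + 2 * x with unit-specific a_i and b_i.
panel = {
    "a": ([0, 1, 2, 3], [0, 1, 0, 1], [1.0, 3.5, 2.0, 4.5]),
    "b": ([0, 1, 2, 3], [1, 0, 0, 1], [0.0, 0.0, 2.0, 6.0]),
}
beta = feis_slope(panel)
```

Because the individual trends are removed before pooling, the true coefficient of 2 is recovered even though the two units have different growth rates.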

Generalized method of moments estimators in Stata

David Drukker
StataCorp LP
Stata 11 has a new command, gmm, for estimating parameters by the generalized method of moments (GMM). gmm can estimate the parameters of linear and nonlinear models for cross-sectional, panel, and time-series data. In this presentation, I provide an introduction to GMM and to the gmm command.

Analyzing proportions

Maarten Buis
University of Tübingen
In this talk, I will discuss some techniques available in Stata for analyzing dependent variables that are proportions. I will discuss four programs: betafit, glm, dirifit, and fmlogit. The first two deal with situations where we want to explain only one proportion, while the latter two deal with situations where we have for each observation multiple proportions that must add up to one. I will focus on how to interpret the results of these models and on the relative strengths and weaknesses of these models.

User-written Stata program: agrm

Alejandro Ecker
University of Mannheim
In the context of his research on perceptual agreement, Cees van der Eijk (2001, Quality & Quantity 35: 325–341) indicates that empirical measures that resort to the standard deviation of the response distribution capture not only consensus but also skewness. Thus they are inappropriate measures of agreement. His alternative measure of agreement, A, circumvents this problem and yields unbiased figures for all kinds of ordered rating scales. It first decomposes the frequency distribution into constituent layers, that is, row vectors for which consensus can be unambiguously defined. It then computes the weighted average degree of agreement. Given the lack of a corresponding ado-file, the user-written agrm command allows you to directly calculate van der Eijk’s index of agreement, A, in Stata. Besides a broad range of basic programming features such as low-level parsing and specifying additional program options, agrm uses more advanced techniques such as handling empty categories and handling numerical missing values. Moreover, it highlights the potential of nested loops and local macros in the context of multiple permutations. Finally, the agrm command is especially suited for showing how Stata’s matrix language, Mata, provides a powerful environment for handling vectors and matrices.
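The layer decomposition and weighted averaging described above can be sketched as follows. The per-layer formula (a unimodality term built from triples of categories, scaled by the share of occupied categories) is my reading of the published algorithm and should be treated as an assumption to check against van der Eijk (2001). The code is not the agrm implementation, and the function names are hypothetical.

```python
def pattern_agreement(pattern):
    """Agreement of one layer, given as a 0/1 vector over K ordered categories."""
    k = len(pattern)
    s = sum(pattern)
    if s == 1:
        return 1.0  # all responses of the layer fall in a single category
    tu = tdu = 0
    for a in range(k - 2):
        for b in range(a + 1, k - 1):
            for c in range(b + 1, k):
                triple = (pattern[a], pattern[b], pattern[c])
                if triple in ((1, 1, 0), (0, 1, 1)):
                    tu += 1  # triple consistent with a single peak
                elif triple == (1, 0, 1):
                    tdu += 1  # empty category between two occupied ones
    if tu + tdu == 0:
        u = 1.0
    else:
        u = ((k - 2) * tu - (k - 1) * tdu) / ((k - 2) * (tu + tdu))
    return u * (1 - (s - 1) / (k - 1))


def agreement(freq):
    """Weighted average of layer agreements for integer counts over ordered categories."""
    total = sum(freq)
    if total <= 0:
        raise ValueError("frequency vector must contain at least one response")
    remainder = list(freq)
    result = 0.0
    while any(f > 0 for f in remainder):
        pattern = [1 if f > 0 else 0 for f in remainder]
        height = min(f for f in remainder if f > 0)
        weight = height * sum(pattern) / total  # share of responses in this layer
        result += weight * pattern_agreement(pattern)
        remainder = [f - height * p for f, p in zip(remainder, pattern)]
    return result


# Hypothetical response distributions on a five-point scale.
a_uniform = agreement([5, 5, 5, 5, 5])
a_consensus = agreement([0, 0, 20, 0, 0])
a_polarized = agreement([10, 0, 0, 0, 10])
```

Under this reading, a uniform distribution scores 0, full consensus scores 1, and an even split between the two extreme categories scores -1.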

Yet another program to create publication-quality tables

Tamás Bartus
Institute of Sociology and Social Policy, Corvinus University
Stata users have developed several programs to create publication-quality documents containing regression results (outreg, outreg2, outtex, estout), tables of statistics (tabout), and contents of matrices (outtable). So far, less effort has been made to enable the easy publication of other kinds of tables, such as those displaying the definitions of variables and summary statistics. Although the sophisticated estout package can create tables other than regression results, the underlying mechanism of posting results as if they were estimation results has limitations, and removing these limitations should involve additional programming.

The user-written command publish (working title) is intended for users with limited programming knowledge. It creates publication-quality documents (HTML, MS Word, or LaTeX) that may consist of tables displaying the following elements: definitions of variables, codebooks, summary statistics, one-way and two-way frequencies, various statistics, or estimation results. Users can create large tables where results are separately shown for various subsamples or for several cross-tabulations with a common dependent variable. Users can combine different sorts of elementary tables. Users can also publish matrices of part of the data in memory and create empty tables into which results from other tables can be pasted. Controlling the layout of the table and the column titles and supercolumn titles is also easily done using a small number of common options.

RDS—a Stata program for respondent-driven sampling

Matthias Schonlau
DIW and Rand Corporation
Elisabeth Liebau
Respondent-driven sampling (RDS) is a sampling technique typically employed for hard-to-reach populations (for example, homeless people, people with AIDS, immigrants). Briefly, initial seed respondents recruit additional respondents from their network of friends. The recruiting process repeats iteratively, thereby forming long referral chains. It is crucial to obtain estimates of respondents’ network sizes (for example, the number of friends with the characteristic of interest). RDS shares some similarities with snowball sampling, but the theoretical foundation for inference using RDS samples is much stronger. We will give a brief overview of this technique and introduce a new user-written Stata command for RDS.
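One way to see why network sizes matter is the inverse-degree-weighted proportion often called the RDS-II or Volz-Heckathorn estimator, sketched below. This is an illustration only: the abstract does not say which estimator the new command implements, and the function name and data are hypothetical.

```python
def rds_inverse_degree_estimate(has_trait, degree):
    """Proportion with the trait, weighting each respondent by 1 / reported network size."""
    if len(has_trait) != len(degree):
        raise ValueError("has_trait and degree must have the same length")
    if not degree or any(d <= 0 for d in degree):
        raise ValueError("every respondent needs a positive network size")
    weights = [1.0 / d for d in degree]
    weighted_hits = sum(w for w, h in zip(weights, has_trait) if h)
    return weighted_hits / sum(weights)


# Hypothetical sample: respondents with the trait report much larger networks.
has_trait = [1, 1, 0, 0]
degree = [10, 10, 2, 2]
naive = sum(has_trait) / len(has_trait)
weighted = rds_inverse_degree_estimate(has_trait, degree)
```

Well-connected respondents are more likely to be recruited, so the weighted estimate here falls well below the unweighted sample proportion of one half.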

Report to the users

Bill Gould
StataCorp LP
Bill Gould, president of StataCorp and head of development, talks about Stata.




