2010 German Stata Users Group meeting: Abstracts
Biometrical modeling of twin and family data in Stata
Sophia Rabe-Hesketh
University of California–Berkeley
Data on twins or on other types of family structures (for example, nuclear families,
siblings, cousins) can be used to estimate the proportion of variability
in observed traits (or phenotypes) that is due to genes. The
models are essentially multivariate regression models with residual
covariance structures dictated by Mendelian genetics. Usually, specialized
software for structural equation modeling is used. However, the required
covariance structures can also be produced using mixed models and by specifying
an appropriate design matrix for the random part of the model. Stata’s
xtmixed command can then be used to estimate the models. For binary
phenotypes, such as diabetes, the appropriate probit models can be estimated
using gllamm.
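As a rough illustration of a covariance structure dictated by genetics, the sketch below builds the implied covariance matrix of a twin pair under the standard ACE decomposition (additive genetic, common environment, unique environment). The ACE model, the function names, and the variance values are assumptions chosen for illustration; the talk may use a different parameterization.

```python
def ace_twin_covariance(a2, c2, e2, zygosity):
    """Implied 2x2 covariance matrix of a twin pair under an ACE model.

    a2, c2, e2 are the additive genetic, common environment, and unique
    environment variance components (hypothetical values in the example).
    """
    # Share of additive genetic variance common to both twins:
    # 1.0 for identical (MZ) pairs, 0.5 on average for fraternal (DZ) pairs.
    genetic_share = {"MZ": 1.0, "DZ": 0.5}[zygosity]
    total = a2 + c2 + e2
    cov = genetic_share * a2 + c2
    return [[total, cov], [cov, total]]


def heritability(a2, c2, e2):
    """Proportion of phenotypic variance attributed to additive genes."""
    return a2 / (a2 + c2 + e2)


mz = ace_twin_covariance(0.5, 0.25, 0.25, "MZ")
dz = ace_twin_covariance(0.5, 0.25, 0.25, "DZ")
h2 = heritability(0.5, 0.25, 0.25)
```

The gap between the MZ and DZ covariances is what identifies the genetic component: here it equals half of a2.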
Additional information
germany10_rabe-hesketh.pdf
germany10_rabe-hesketh.zip
An introduction to matching methods for causal inference and
their implementation in Stata
Barbara Sianesi
Institute for Fiscal Studies
Matching, especially in its propensity-score flavors, has become an extremely
popular evaluation method. Matching is, in fact, the best-available method for
selecting a matched (or reweighted) comparison group that looks
like the treatment group of interest.
In this talk, I will introduce matching methods within the general problem of
causal inference, highlight their strengths and weaknesses, and offer a brief
overview of different matching estimators. Using psmatch2, I will
then step through a practical example in Stata that is based on real data,
show how to implement some of these estimators, and highlight a number of
implementation issues.
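To make the idea of a matched comparison group concrete, here is a minimal sketch of one-to-one nearest-neighbor matching on the propensity score, with replacement, used to estimate the average treatment effect on the treated. It is not psmatch2's implementation: the propensity scores are assumed to be estimated already, and the function name and data are hypothetical.

```python
def att_nearest_neighbor(treated, controls):
    """Average treatment effect on the treated via nearest-neighbor matching.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Each treated unit is matched, with replacement, to the control unit
    whose propensity score is closest.
    """
    diffs = []
    for p_t, y_t in treated:
        # Pick the control with the smallest propensity-score distance.
        _, y_c = min(controls, key=lambda pc: abs(pc[0] - p_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)


# Hypothetical (propensity score, outcome) data.
treated = [(0.8, 12.0), (0.6, 10.0), (0.4, 9.0)]
controls = [(0.75, 9.0), (0.5, 8.0), (0.2, 5.0), (0.35, 6.0)]
att = att_nearest_neighbor(treated, controls)
```

The control with score 0.2 is never used, which shows how matching discards comparison units that do not resemble any treated unit.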
Additional information
germany10_sianesi.pdf
germany10_sianesi_materials.zip
Heterogeneous treatment-effect analysis
Ben Jann
ETH Zürich
Methods for causal inference and the estimation of treatment effects have
received much attention in recent years. Most of the methodological and
applied work focuses on the identification of so-called average treatment
effects, possibly restricted to the treated or the untreated.
However, treatment effects may vary (hence the averaging), and it can be
interesting to analyze the patterns of effect heterogeneity. In this talk, I
will present a new command called hte that is used for heterogeneous
treatment-effect analysis in Stata. hte first constructs balanced
propensity-score strata and, within each stratum, estimates the average
treatment effect. hte then tests for a linear trend in effects across
the strata. The stratum-specific treatment effects and the estimated linear
trend are displayed in a two-way graph. hte results from joint
work with Jennie E. Brand (UCLA) and Yu Xie (University of Michigan).
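The two steps described above (stratum-specific effects, then a trend across strata) can be sketched as follows. This is a simplified illustration, not the hte command: strata are formed as equal-sized groups by propensity-score rank without any balance testing, the trend is an unweighted least-squares slope, and all names and data are hypothetical.

```python
def stratum_effects(units, n_strata):
    """Difference in mean outcomes (treated minus control) per stratum.

    units: list of (propensity_score, treated_flag, outcome).
    Strata are equal-sized groups by propensity-score rank (an assumption);
    each stratum must contain both treated and control units.
    """
    ordered = sorted(units, key=lambda u: u[0])
    size = len(ordered) // n_strata
    effects = []
    for s in range(n_strata):
        start = s * size
        # The last stratum absorbs any remainder.
        end = (s + 1) * size if s < n_strata - 1 else len(ordered)
        chunk = ordered[start:end]
        y1 = [y for _, d, y in chunk if d == 1]
        y0 = [y for _, d, y in chunk if d == 0]
        effects.append(sum(y1) / len(y1) - sum(y0) / len(y0))
    return effects


def linear_trend(effects):
    """Unweighted least-squares slope of effects on stratum rank 1..S."""
    ranks = list(range(1, len(effects) + 1))
    r_bar = sum(ranks) / len(ranks)
    e_bar = sum(effects) / len(effects)
    sxy = sum((r - r_bar) * (e - e_bar) for r, e in zip(ranks, effects))
    sxx = sum((r - r_bar) ** 2 for r in ranks)
    return sxy / sxx


# Hypothetical data: (propensity score, treated flag, outcome).
units = [
    (0.10, 1, 5.0), (0.15, 0, 4.0), (0.20, 1, 5.0), (0.25, 0, 4.0),
    (0.40, 1, 7.0), (0.45, 0, 5.0), (0.50, 1, 7.0), (0.55, 0, 5.0),
    (0.70, 1, 9.0), (0.75, 0, 6.0), (0.80, 1, 9.0), (0.85, 0, 6.0),
]
effects = stratum_effects(units, 3)
slope = linear_trend(effects)
```

A positive slope means the estimated effect grows with the propensity to be treated; a negative slope means it shrinks.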
Additional information
germany10_jann.pdf
Estimation of linear fixed-effects models with individual-specific slopes in Stata
Volker Ludwig
Mannheim Center for European Social Research (MZES)
Fixed-effects regression is considered a powerful method for estimating causal
effects with survey data. However, in the linear model, the conventional
technique of time-demeaning does not yield consistent estimates of the
parameters when unobserved heterogeneity is not time-constant. Jeffrey M. Wooldridge
(2002, Econometric Analysis of Cross
Section and Panel Data [MIT Press], 317–322)
derived a general model for the situation where unobserved and observed
characteristics of individuals interact to produce the outcome. The
fixed-effects model with individual constants and slopes (FEIS) is a remedy
for coefficients that are biased due to, for example, maturation or learning where
unobserved traits affect individual growth curves differently for treated
and controls.
The Stata xtfeis command implements the FEIS estimator in Mata, allowing for
individual constants and (potentially many) slopes. When no slope variables
are specified, the model collapses to the conventional model estimated by xtreg,
fe, which accounts for individual constants only. xtfeis implements standard
errors that are robust to serial correlation or heteroskedasticity of
unknown form. Estimates of the slope parameters are available optionally.
The command requires panel data with at least J + 1 observations per unit,
where J is the number of individual-specific slope variables (usually, but
not necessarily, also including the individual-specific constant). I will
present results for the effect of marriage on male wages based on real data
(GSOEP and NLSY) to demonstrate the practical relevance of the method.
I will use simulation results to assess robustness of the estimator to
autocorrelation, measurement error, and misspecification of functional form.
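The idea behind the estimator can be sketched for the simplest case: one regressor and one individual-specific slope variable (time). Each unit's outcome and regressor are detrended by a unit-specific regression on a constant and time, and pooled least squares is run on the residuals. This is an illustrative sketch under those assumptions, not the xtfeis code; the function names and data are hypothetical, and no standard errors are computed.

```python
def detrend(values, times):
    """Residuals from a unit-specific regression of values on (1, time)."""
    n = len(values)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    stt = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values)) / stt
    return [v - v_bar - slope * (t - t_bar) for t, v in zip(times, values)]


def feis_coefficient(panel):
    """Pooled least-squares coefficient on unit-wise detrended data.

    panel: dict mapping unit id -> (times, x, y).
    Each unit needs at least 3 observations here (constant + one slope + 1).
    """
    num = 0.0
    den = 0.0
    for times, x, y in panel.values():
        rx = detrend(x, times)
        ry = detrend(y, times)
        num += sum(a * b for a, b in zip(rx, ry))
        den += sum(a * a for a in rx)
    return num / den


# Hypothetical noise-free panel: y = a_i + b_i * t + 2 * x,
# with unit-specific intercepts a_i and time slopes b_i.
times = [1, 2, 3, 4]
x_a = [0.0, 1.0, 0.0, 2.0]
x_b = [1.0, 0.0, 3.0, 1.0]
panel = {
    "A": (times, x_a, [1.0 + 0.5 * t + 2.0 * x for t, x in zip(times, x_a)]),
    "B": (times, x_b, [-3.0 + 2.0 * t + 2.0 * x for t, x in zip(times, x_b)]),
}
beta_hat = feis_coefficient(panel)
```

Because the detrending removes both the unit-specific constant and the unit-specific time slope, the coefficient on x is recovered even though the two units follow different growth curves.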
Additional information
germany10_ludwig.pdf
RDS—a Stata program for respondent-driven sampling
Matthias Schonlau
DIW and RAND Corporation
Elisabeth Liebau
DIW
Respondent-driven sampling (RDS) is a sampling technique typically employed
for hard-to-reach populations (for example, homeless people, people with AIDS, immigrants).
Briefly, initial seed respondents recruit additional respondents from their
network of friends. The recruiting process repeats iteratively, thereby
forming long referral chains. It is crucial to obtain estimates of
respondents’ network sizes (for example, the number of friends with the
characteristic of interest). RDS shares some similarities with snowball
sampling, but the theoretical foundation for inference using RDS samples is
much stronger. We will give a brief overview of this technique and
introduce a new user-written Stata command for RDS.
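To show why network sizes matter, the sketch below computes a proportion using inverse-degree weights, the approach known as the RDS-II (Volz-Heckathorn) estimator. Respondents with large networks are more likely to be recruited, so they are weighted down. Whether the Stata command presented here uses this estimator is an assumption; the function name and data are hypothetical.

```python
def rds_ii_proportion(sample):
    """Inverse-degree weighted proportion of respondents with a trait.

    sample: list of (has_trait, degree), where degree is the respondent's
    reported network size (must be positive).
    """
    weights = [1.0 / degree for _, degree in sample]
    trait_weight = sum(w for (has_trait, _), w in zip(sample, weights) if has_trait)
    return trait_weight / sum(weights)


# Hypothetical sample: (has trait, reported network size).
sample = [(True, 2), (True, 4), (False, 4), (False, 4)]
weighted = rds_ii_proportion(sample)
naive = sum(1 for has_trait, _ in sample if has_trait) / len(sample)
```

In this example the naive sample proportion is 0.5, while the weighted estimate is higher because one respondent with the trait reports a small network and therefore stands in for more of the population.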
Additional information
germany10_schonlau_liebau.ppt
Report to the users
Bill Gould
StataCorp LP
Bill Gould, president of StataCorp and head of development, talks about Stata.