Last updated: 8 June 2013
2013 German Stata Users Group meeting 
 Friday, 7 June 2013 
 
  University of Potsdam
  Germany
Proceedings
 Creating complex tables for publication 
 John Luke Gallup 
 Portland State University 
  Complex statistical tables often must be built up by parts from the results
  of multiple Stata commands. I show the capabilities of frmttable and
  outreg for creating complex tables, and even fully formatted
  statistical appendices, for Word and TeX documents. Precise formatting of
  these tables from within Stata has the same benefits as writing do-files
  for statistics commands. The resulting tables are reproducible and reusable
  when the data change, saving the user time.
   
   
Additional information
   de13_gallup.pdf
 
 An expanded framework for mixed process modeling in Stata
 David Roodman 
 Center for Global Development 
  Roodman (Stata Journal, 2011) introduced the program cmp for using
  maximum likelihood to fit multiequation combinations of Gaussian-based
  models such as tobit, probit, ordered probit, multinomial probit, interval
  censoring, and continuous linear.  This presentation describes substantial
  extensions to the framework and software:  factor variable support; the
  rank-ordered probit model; the ability to specify precensoring truncation in
  most model types; hierarchical random effects and coefficients that are
  potentially correlated across equations; the ability to include the
  unobserved linear variables behind endogenous variables—not just their
  observed, censored manifestations—on the right side of other equations
  and, when so doing, the allowance for simultaneity in the system of
  equations.  Contrary to the title of Roodman (2011), models no longer need
  be recursive or fully observed.
  
   
Additional information
   de13_roodman.pptx
 
 Provide, Enrich, and Make Accessible: Using Stata’s Capabilities
for Disseminating NEPS Scientific Use Data
 Daniel Bela 
 National Educational Panel Study (NEPS), Data Center, University of Bamberg
  The National Educational Panel Study (NEPS) is becoming one of
  Germany's major publishers of scientific use data for educational research.
  Disseminating data from six panel cohorts makes not only structured data
  editing but also documentation and user support a major challenge. In order
  to accomplish this task, the NEPS Data Center has implemented a sophisticated
  metadata system. The system not only supports structured documentation of
  the metadata of survey instruments and data files but also enriches the
  scientific use files with further information, which makes the data
  significantly easier to analyze. As a result, NEPS provides bilingual dataset files
  (German and English) and allows the user to instantly see, for instance, the
  exact wording of the question leading to the data in a distinct variable
  without leaving the dataset. To achieve this, structured metadata is
  attached to the data using Stata's characteristics functionality. To make
  handling additional metadata even easier, the NEPS Data Center provides a
  package of user-written programs, NEPStools, to data users. The
  presentation will cover an introduction to the NEPS data preparation
  workflow, focusing on the metadata system and its role in enriching the
  scientific use data by using Stata's capabilities. Afterward,
  NEPStools will be introduced.
  
   
Additional information
   de13_bela.pdf
 
 newspell—Easy Management of Complex Spell Data 
 Hannes Neiss 
 German Institute for Economic Research
  Biographical data gathered in surveys is often stored in spell format,
  allowing for overlaps between spell states. This gives useful information to
  researchers but leaves them with a very complex data structure, which is
  not easy to handle. I present my work on the ado-package newspell. It
  includes several subprograms for management of complex spell data. Spell
  states can be merged, reducing the overall number of spells.  newspell
  allows a user to fill gaps with information from spells before and after the
  gap, given a user-defined preference. However, the two most important
  features of newspell are, first, the ability to rank spells and cut off
  overlaps according to the rank order. This is a necessary step before
  performing, for example, sequence analysis on spell data. Second, newspell
  can combine overlapping spells into new categories of spells, generating
  entirely new states. This is useful for cleaning data, for analyzing
  simultaneity of states, or for combining two spell datasets that have
  information on different kinds of states (for example, labor market and
  marital status). newspell is useful for users who are not familiar with
  complex spell data and have little experience in Stata programming for data
  management. For experienced users, it saves a lot of time and coding work.
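
The rank-and-cut step described above can be illustrated with a small sketch. This is not newspell's code or syntax; it is a language-neutral illustration in Python, and the spell records, state names, and rank order are hypothetical. It assumes monthly spells with inclusive start and end months and keeps, for every month, the state with the best rank.

```python
# Hypothetical spells for one person: (state, first_month, last_month), inclusive.
spells = [
    ("employed", 1, 6),
    ("education", 4, 9),
    ("unemployed", 9, 12),
]

# Hypothetical rank order: a lower number wins where spells overlap.
rank = {"employed": 1, "education": 2, "unemployed": 3}


def cut_overlaps(spells, rank):
    """Return non-overlapping spells, keeping the best-ranked state per month."""
    best = {}  # month -> winning state
    for state, start, end in spells:
        for month in range(start, end + 1):
            if month not in best or rank[state] < rank[best[month]]:
                best[month] = state
    if not best:
        return []
    result = []
    months = sorted(best)
    # Collapse consecutive months that share the same winning state.
    run_start = prev = months[0]
    for month in months[1:]:
        if month != prev + 1 or best[month] != best[run_start]:
            result.append((best[run_start], run_start, prev))
            run_start = month
        prev = month
    result.append((best[run_start], run_start, prev))
    return result


resolved = cut_overlaps(spells, rank)
print(resolved)
```

After this step each month belongs to exactly one state, which is the form that sequence analysis typically expects.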
  
 
  Additional information
  de13_kroeger.pdf
 
 Instrumental variables estimation using heteroskedasticity-based
instruments 
 Christopher F. Baum 
 Boston College 
 Arthur Lewbel 
 Boston College 
 Mark E. Schaffer
 Heriot–Watt University, Edinburgh 
 Oleksandr Talavera 
 University of Sheffield
  In a 2012 article in the Journal of Business and Economic Statistics, Arthur
  Lewbel presented theory that allows the identification and estimation of
  "mismeasured and endogenous regressor models" by exploiting
  heteroskedasticity. These models include linear regression models
  customarily estimated with instrumental variables (IV) or IV-GMM techniques.
  Lewbel's method, under suitable conditions, can provide instruments where no
  conventional instruments are available or augment standard instruments to
  enable tests of overidentification in the context of an exactly identified
  model. In this talk, I discuss the rationale for Lewbel's methodology and
  illustrate its implementation in a variant of Baum, Schaffer, and Stillman's
  ivreg2 routine, ivreg2h.
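
As a rough illustration of the idea, the sketch below builds a heteroskedasticity-based instrument the way the method is usually described: regress the endogenous regressor on the exogenous variable, then multiply the residuals by the mean-centered exogenous variable. It is written in Python rather than Stata, it is not the ivreg2h code, and the simulated data, the variable names, and the single-regressor setup are assumptions made for brevity.

```python
import math
import random


def ols_residuals(y, x):
    """Residuals from a simple regression of y on a constant and x."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def lewbel_instrument(endog, exog):
    """Generated instrument: (z - mean(z)) times the first-stage residual."""
    resid = ols_residuals(endog, exog)
    exog_bar = sum(exog) / len(exog)
    return [(zi - exog_bar) * ei for zi, ei in zip(exog, resid)]


# Simulated data (assumption): the first-stage error variance rises with z,
# which is the heteroskedasticity the approach relies on.
rng = random.Random(2013)
z = [rng.gauss(0.0, 1.0) for _ in range(500)]
x = [1.0 + 0.5 * zi + math.exp(0.5 * zi) * rng.gauss(0.0, 1.0) for zi in z]

instrument = lewbel_instrument(x, z)
print(sum(instrument) / len(instrument))  # close to zero by construction
```

The generated variable can then be used in a standard IV or IV-GMM estimator, either alone or alongside conventional instruments.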
  
   
Additional information
   de13_baum.pdf
 
 Using simulation to inspect the performance of a test, in
particular tests of the parallel regressions assumption in ordered logit and
probit models 
 Maarten L. Buis 
 Social Science Research Center (WZB) 
 Richard Williams
 University of Notre Dame
  In this talk, we will show how to use simulations in Stata to explore to
  what extent and under what circumstances a test is problematic. We will
  illustrate this for a set of tests of the parallel regression assumption in
  ordered logit and probit models: the Brant, likelihood-ratio, Wald, score,
  and Wolfe-Gould tests. A common
  impression is that these tests tend to be anti-conservative; that is,
  they tend to reject a true null hypothesis too often. We will use
  simulations to try to quantify when and to what extent this is the case. We
  will also use these simulations to create a more robust bootstrap variation
  of the tests. The purpose of this talk is twofold: first, we want to explore
  the performance of these tests. For this purpose, we will present a new
  program, oparallel, that implements all tests and their bootstrap variation.
  Second, we want to give more general advice on how to use Stata to create
  simulations when one has doubts about a certain test. For this purpose, we
  will present the simpplot command, which can help to interpret the
  p-values returned by such a simulation.
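
The general recipe (draw data for which the null hypothesis is true, run the test, store the p-value, and compare rejection rates with nominal levels) can be sketched as follows. The sketch is in Python rather than Stata, and it uses a simple z test for a mean as a stand-in; the test, sample size, number of replications, and seed are all assumptions, not the tests studied in the talk.

```python
import math
import random


def z_test_pvalue(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value of a z test for the mean with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_pvalues(reps=2000, n=50, seed=12345):
    """Draw data under a true null hypothesis and collect the p-values."""
    rng = random.Random(seed)
    return [
        z_test_pvalue([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def rejection_rate(pvalues, alpha):
    """Share of simulated p-values below the nominal level alpha."""
    return sum(p < alpha for p in pvalues) / len(pvalues)


pvalues = simulate_pvalues()
for alpha in (0.01, 0.05, 0.10):
    # For a well-behaved test the two numbers should be close.
    print(alpha, rejection_rate(pvalues, alpha))
```

A test that rejects a true null noticeably more often than the nominal level is anti-conservative; one that rejects less often is conservative.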
  
   
Additional information
   de13_buis.pdf
 
 Fitting Complex Mixed Logit Models with Particular Focus on
Labor Supply Estimation
 Max Löffler 
 Institute for the Study of Labor (IZA)
  When one estimates discrete choice models, the mixed logit approach is
  commonly superior to simple conditional logit setups. Mixed logit models not
  only allow the researcher to implement difficult random components but also
  overcome the restrictive IIA assumption. Despite these theoretical
  advantages, the estimation of mixed logit models becomes cumbersome when the
  model’s complexity increases. Applied works therefore often rely on rather
  simple empirical specifications because this reduces the computational
  burden. I introduce the user-written command lslogit, which fits
  complex mixed logit models using maximum simulated likelihood methods. As
  lslogit is a d2-ML-evaluator written in Mata, the estimation is
  rather efficient compared with other routines. It allows the researcher to
  specify complicated structures of unobserved heterogeneity and to choose
  from a set of frequently used functional forms for the direct utility
  function—for example, Box-Cox transformations, which are difficult to
  estimate in the context of logit models. The particular focus of
  lslogit is on the estimation of labor supply models in the discrete
  choice context; therefore, it facilitates several computationally exhausting
  but standard tasks in this research area. However, the command can be used
  in many other applications of mixed logit models as well.
  
   
Additional information
   de13_loeffler.pdf
 
 Simulated Multivariate Random Effects Probit Models for Unbalanced Panels
 Alexander Plum
 Otto-von-Guericke University Magdeburg
  This paper develops a method for implementing a simulated multivariate
  random-effects probit model for unbalanced panels and illustrates it with
  artificial data. Halton draws generated by mdraws are used to simulate
  multivariate normal probabilities with the mvnp() function. The
  estimator can be easily adjusted (for example, to allow for autocorrelated
  errors). Advantages of this simulated estimation are high accuracy and lower
  computation time compared with existing commands such as redpace.
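
For readers unfamiliar with Halton draws, the following sketch shows how such a sequence is constructed and turned into standard normal draws with the inverse normal CDF. It is a generic Python illustration, not the mdraws implementation; the choice of bases 2 and 3 and the absence of any burn-in or scrambling are assumptions.

```python
from statistics import NormalDist


def halton(index, base):
    """Radical inverse of a positive integer index in the given base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value


def halton_draws(n, bases=(2, 3)):
    """First n points of a Halton sequence, one coordinate per prime base."""
    return [tuple(halton(i, b) for b in bases) for i in range(1, n + 1)]


draws = halton_draws(4)

# Map the uniform draws to standard normal draws with the inverse CDF.
std_normal = NormalDist()
normal_draws = [tuple(std_normal.inv_cdf(u) for u in point) for point in draws]

print(draws)
print(normal_draws)
```

Because Halton points cover the unit interval more evenly than pseudorandom numbers, simulated probabilities usually need fewer draws for a given accuracy.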
  
   
Additional information
   de13_plum.pdf
 
 xsmle—A Command to Estimate Spatial Panel Models in Stata
 Federico Belotti 
 University of Rome "Tor Vergata" 
 Gordon Hughes 
 University of Edinburgh
 Andrea Piano Mortari
 University of Rome "Tor Vergata" 
  Econometricians have begun to devote more attention to spatial interactions
  when carrying out applied econometric studies. The new command we are
  presenting, xsmle, fits fixed- and random-effects spatial models for
  balanced panel data for a wide range of specifications: the spatial
  autoregressive model, spatial error model, spatial Durbin model, spatial
  autoregressive model with autoregressive disturbances, and generalized
  spatial random effect model with or without a dynamic component. Different
  weighting matrices may be specified for different components of the models,
  and both Stata matrices and spmat objects are allowed. Furthermore,
  xsmle calculates direct, indirect, and total effects according to
  LeSage (2008), implements the Lee and Yu (2010) data transformation for
  fixed-effects models, and may be used with the mi prefix when the panel
  is unbalanced.
  
   
Additional information
   de13_mortari.pdf
 
 Estimating the dose-response function through the GLM approach
 Barbara Guardabascio 
 Italian National Institute of Statistics, Rome 
 Marco Ventura
 Italian National Institute of Statistics, Rome 
  How effective are policy programs with continuous treatment exposure?
  Answering this question essentially amounts to estimating a dose-response
  function as proposed in Hirano and Imbens (2004). Whenever doses are not
  randomly assigned but are given under experimental conditions, estimation
  of a dose-response function is possible using the Generalized Propensity
  Score (GPS). Since its formulation, the GPS has been repeatedly used in
  observational studies, and ad hoc programs have been provided for Stata users
  (doseresponse and gpscore, Bia and Mattei 2008). However, many
  applied works remark that the treatment variable may not be normally
  distributed. In this case, these Stata programs cannot be used because they
  allow no distributional assumption other than the normal
  density. In this paper, we overcome this problem. Building on Bia and
  Mattei's (2008) programs, we provide doseresponse2 and
  gpscore, which allow one to accommodate different distribution
  functions of the treatment variable. This task is accomplished by
  applying the generalized linear models estimator in the first step
  instead of maximum likelihood. In such a way, the user
  can have a very versatile tool capable of handling many practical
  situations. It is worth highlighting that our programs, among the many
  alternatives, take into account the possibility to consistently use the GPS
  estimator when the treatment variable is fractional, the flogit case by
  Papke and Wooldridge (1998), a case of particular interest for economists.
  
   
Additional information
   de13_ventura.ppt
 
Predictive Margins and Marginal Effects in Stata
 Ben Jann
 University of Bern
  Tables of estimated regression coefficients, usually accompanied by
  additional information such as standard errors, t statistics,
  p-values, confidence intervals, or significance stars, have long been
  the preferred way of communicating results from statistical models. In
  recent years, however, the limits of this form of exposition have been
  increasingly recognized. For example, interpretation of regression tables
  can be very challenging in the presence of complications such as interaction
  effects, categorical variables, or nonlinear functional forms.  Furthermore,
  while these issues might still be manageable in the case of linear
  regression, interpretational difficulties can be overwhelming in nonlinear
  models (for example, logistic regression). To facilitate sensible
  interpretation of these models, one must often compute additional results
  such as marginal effects, predictive margins, or contrasts.  Moreover, smart
  graphical displays of results can be very valuable in making complex
  relations accessible. A number of helpful commands geared at supporting
  these tasks have been recently introduced in Stata, making elaborate
  interpretation and communication of regression results possible without much
  extra effort. Examples of these commands are margins,
  contrasts, and marginsplot. In my talk, I will discuss the
  capabilities of these commands and present a range of examples illustrating
  their use.
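
To make the two quantities concrete, the sketch below computes predictive margins and an average marginal effect by hand for a logit model. It is written in Python and is not Stata syntax; the coefficients and the five observations are hypothetical, and the code is only meant to show the kind of averaging involved.

```python
import math


def logistic(v):
    """Inverse logit: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical logit coefficients: constant, continuous x, binary d.
b0, b_x, b_d = -1.0, 0.8, 0.5

# Hypothetical estimation sample: (x, d) pairs.
data = [(-1.0, 0), (0.0, 1), (0.5, 0), (1.5, 1), (2.0, 0)]


def predictive_margin(d_value):
    """Average predicted probability with d set to d_value for everyone."""
    probs = [logistic(b0 + b_x * x + b_d * d_value) for x, _ in data]
    return sum(probs) / len(probs)


def average_marginal_effect_x():
    """Average derivative of the predicted probability with respect to x."""
    effects = []
    for x, d in data:
        p = logistic(b0 + b_x * x + b_d * d)
        effects.append(b_x * p * (1.0 - p))
    return sum(effects) / len(effects)


margin_d0 = predictive_margin(0)
margin_d1 = predictive_margin(1)
ame_x = average_marginal_effect_x()

print(margin_d0, margin_d1, margin_d1 - margin_d0)
print(ame_x)
```

The difference between the two predictive margins is a contrast on the probability scale, which is often easier to communicate than the raw logit coefficient.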
  
   
Additional information
   de13_jann.pdf
 
Scientific organizers
Johannes Giesecke, University of Bamberg
Ulrich Kohler, University of Potsdam
Logistics organizers
The conference is sponsored and organized by Dittrich & Partner Consulting GmbH
(http://www.dpc.de),
the distributor of Stata in several countries, including
Germany, the Netherlands, Austria, the Czech Republic, and Hungary.