Statalist The Stata Listserver

st: 4th German Users' Group Meeting: Final Announcement and Program

From   Ulrich Kohler <>
Subject   st: 4th German Users' Group Meeting: Final Announcement and Program
Date   Tue, 7 Mar 2006 12:49:08 +0100

4th German Stata Users' Group Meeting: Announcement and Program

The 4th German Stata Users' Group Meeting will be held at the 
University of Mannheim on Friday, March 31st, 2006.

The content of the meeting has been organized by Johannes Giesecke, 
University of Mannheim; Ulrich Kohler, WZB; and Fred Ramb, Deutsche 
Bundesbank. The logistics are being organized by Dittrich and Partner, 
the distributor of Stata in several countries including Germany and 
Austria.

The meeting is open to all interested, and we will be happy if Stata users 
from neighboring countries join us. StataCorp will be represented. The
conference language will be English due to the 'international' nature of the
meeting and the participation of non-German guest speakers. There will be a
"wishes and grumbles" session at which you may air your thoughts to Stata
developers. There will also be an optional informal meal at a Mannheim 
restaurant on Friday evening (at an additional cost of 20 Euro).

Participants are asked to cover their own travel expenses. There will be a small
conference fee (regular 20 Euro, students 10 Euro) to cover the cost of coffee,
tea, and lunch.

For further information on registration, please contact 
Mrs. Mrosek will also assist you in finding accommodation. For general 
information about the meeting see also

Readers of previous announcements should note that the conference venue 
has changed to Room W 117, located in the Schloss. You will find an exact plan 
of the conference venue on

Note: Counting the number of windows, the Schloss of Mannheim is the 
biggest palace in Europe. Even if you don't trust the indicator, believe 
us: the Schloss is big. We therefore ask you to allow ample time. It is 
not difficult to find the Schloss in Mannheim, but it is probably 
difficult to find the room within the Schloss.

Schedule of the 4th German Stata Users' Group Meeting

 8:45 Registration and coffee/tea

 9:15 Welcome
      Johannes Giesecke

 9:30 Resultssets, resultsspreadsheets and resultsplots in Stata
      Roger Newson, Imperial College London

      Most Stata users make their living producing results in a form
      accessible to end users. Most of these end users cannot immediately
      understand Stata logs. However, they can understand tables (in paper,
      PDF, HTML, spreadsheet or word processor documents) and plots (produced
      using Stata or non--Stata software). Tables are produced by Stata as
      resultsspreadsheets, and plots are produced by Stata as resultsplots.
      Sometimes (but not always), resultsspreadsheets and resultsplots are
      produced using resultssets. Resultssets, resultsspreadsheets and
      resultsplots are all produced, directly or indirectly, as output by
      Stata commands. A resultsset is a Stata dataset, which is a table,
      whose rows are Stata observations and whose columns are Stata variables.
      A resultsspreadsheet is a table in generic text format, conforming to a
      TeX or HTML convention, or to another convention with a column separator
      string and possibly left and right row delimiter strings. A resultsplot
      is a plot produced as output, using a resultsset or a resultsspreadsheet
      as input. Resultsset--producing programs include -statsby-, -parmby-,
      -parmest-, -collapse-, -contract-, -xcollapse- and -xcontract-.
      Resultsspreadsheet--producing programs include -outsheet-, -listtex-,
      -estout- and -estimates table-. Resultsplot--producing programs include
      -eclplot- and -mileplot-. There are two main approaches (or dogmas) for
      generating resultsspreadsheets and resultsplots. The resultsset--centred
      dogma is followed by -parmest- and -parmby- users, and states: 
      ``Datasets make resultssets, which make resultsplots and  
       resultsspreadsheets''. The resultsspreadsheet--centred dogma is
      followed by -estout- and -estimates table- users, and states:
      ``Datasets make resultsspreadsheets, which make resultssets, which make
       resultsplots''. The two dogmas are complementary, and each dogma has
      its advantages and disadvantages. The resultsspreadsheet dogma is much
      easier for the casual user to learn to apply in a hurry, and is
      therefore probably preferred by most users most of the time.
      The resultsset dogma is more difficult for most users to learn, but is
      more convenient for users who wish to program everything in
      do-files, with little or no manual cutting and pasting.

10:20 Coffee


10:30 Intervention evaluation using -gllamm-
      Andrew Pickles, University of Manchester

      The gllamm procedure provides a framework within which many of the 
      more difficult analyses required for trials and intervention studies
      may be undertaken.

      Treatment effect estimation in the presence of non-compliance can be 
      undertaken using instrumental variable (IV) methods.  We illustrate how
      gllamm can be used for IV estimation for the full range of types of
      treatment and outcome  measures and describe how missing data may be
      tackled on an assumption of latent ignorability.  Alternative
      approaches to account for clustering and the analysis of 
      cluster-randomised studies will also be described.

      Examples from studies of alcohol consumption among primary care patients,
      cognitive behaviour therapy for depression patients, and a school-based
      smoking intervention are discussed.

11:20 Estimating IRT models with -gllamm-
      Herbert Matschinger, University of Leipzig

      Within the framework of economic evaluation, health econometricians 
      are interested in constructing a meaningful health index that is
      consistent with individual or societal preferences. One way to 
      derive such an index is based on the EQ-5D description and valuation
      of health-related quality of life (HRQOL). The purpose of this study 
      was to analyze how well the EQ-5D reflects one latent construct of 
      HRQOL and how large the potential impact of measurement variance 
      is across six different countries. Data came from the European 
      Study of the Epidemiology of Mental Disorders (ESEMeD), a
      cross-sectional survey of a representative random sample (N=21,425) 
      in Belgium, France, Germany, Italy, the Netherlands and Spain. At 
      least in psychology much attention is paid to different forms of 
      IRT models and particularly the Rasch model, since it is the only 
      model featuring specific objectivity which enables what is called a
      “fair comparison” with respect to the latent dimension to be measured.
      Therefore the dimensionality of the construct is evaluated by means 
      of one-parameter and two-parameter Item Response Theory (IRT).
      Differential Item Functioning is tested with respect to the six
      countries and both the difficulty and discrimination parameters. 
      Results show that a unidimensional one-parameter IRT model holds 
      for all countries provided that the item “anxiety/depression” is
      omitted. If both the physical and the mental components of
      health-related quality of life (HRQOL) are to be represented, the
      questionnaire should be extended to a two-dimensional construct.
      Consequently, more items to portray the mental component are then
      needed. This presentation will focus on
      the possibilities and restrictions in estimating these models with
      -gllamm-. It will be shown how these models can be established 
      and tested. Problems regarding the structure of the data and the
      assignment of incidental parameters to individual observations will 
      be discussed. 

General Statistics

11:50 Variance estimation for Generalized Entropy and Atkinson 
      inequality indices: the complex survey data case
      Martin Biewen, University of Frankfurt

      We derive the sampling variances of Generalized Entropy and Atkinson
      indices when estimated from complex survey data, and show how they 
      can be calculated straightforwardly using widely available software. 
      We also show that, when the same approach is used to derive variance
      formulae for the i.i.d. case, it leads to estimators that are simpler
      than those proposed before. Both cases are illustrated with a 
      comparison of income inequality in Britain and Germany.

12:20 Lunch

13:30 Linear mixed models in Stata
      Roberto G. Gutierrez, StataCorp

      Included with Stata version 9 is the new command xtmixed, for fitting
      linear mixed models. Mixed models contain both fixed and random
      effects. The fixed effects are analogous to standard regression
      coefficients and are estimated directly. The random effects are not
      directly estimated but are summarized according to the unique elements
      of their respective variance–covariance matrices, known as variance
      components. xtmixed syntax is summarized and demonstrated using several
      examples.  In addition, xtmixed and its postestimation routines may be
      used to perform nonparametric smoothing via penalized splines.

User Written Programs

14:20 Implementing Restricted Least Squares in Linear Models 
      J. Haisken-DeNew, RWI Essen 

      The presentation illustrates the user-written program -hds97-, 
      which implements the restricted least squares (RLS) procedure as 
      described by Haisken-DeNew and Schmidt (1997). Log wages are 
      regressed on a group of k-1 industry/region/job/etc. dummies. 
      The k-th dummy is the omitted reference dummy. Using RLS, all 
      k dummy coefficients and standard errors are reported. The 
      coefficients are interpreted as percentage-point deviations from the 
      industry weighted average. An overall measure of dispersion is
      also reported. 

      This ado-file corrects two problems with the Krueger and Summers
      (1988) Econometrica methodology: overstated differential standard 
      errors and understated overall dispersion.

      General comments: The coefficients of continuous variables are 
      not affected by -hds97-. Also, all results calculated in -hds97- 
      are independent of the choice of the reference category. Note that
      for all dummy variable sets having only two outcomes, e.g. male/female,
      the t-values of the -hds97- adjusted coefficients are always equal 
      in magnitude but opposite in sign.

14:50 Sequence analysis using Stata
      Christian Brzinsky-Fay, WZB; Ulrich Kohler, WZB

      Sequences are ordered lists of elements. A typical example of a 
      sequence is the sequence of bases in the DNA of organisms. Other
      examples are sequences of employment stages over a lifetime, or
      individual party preferences over time. Sequence analysis includes 
      techniques to handle, describe, and, most importantly, compare
      sequences with each other. 

      Sequences are most commonly used by scholars of genomes, but far 
      less by social scientists. This is surprising insofar as sequence 
      data is readily available in many datasets for the social sciences. 
      In fact, all data from panel studies can be regarded as sequence data.
      Despite that, social scientists relatively seldom use panel data for
      sequence analysis. The first aim of the presentation therefore is to
      illustrate typical research topics that can be addressed with sequence
      analysis. The second part will then describe a bundle of user-written
      Stata programs for sequence analysis, including a Mata algorithm for
      performing optimal matching with the so-called "Needleman-Wunsch"
      algorithm.

15:30 Coffee

15:40 New Tools for Evaluating the Results of Cluster Analyses
      Hildegard Schaeper, HIS

      Clustering methods are designed for finding groups in data, for 
      grouping similar objects (variables or observations) into the same
      cluster and dissimilar objects into separate clusters. Whereas this 
      main idea is rather simple, carrying out a cluster analysis remains a
      challenging task: The number of different clustering methods is huge 
      and clustering includes many choices, such as the decision between basic
      approaches (e.g. hierarchical and partitioning methods), the choice of
      a dissimilarity or similarity measure, the selection of a particular
      linkage method when performing a hierarchical agglomerative cluster
      analysis, the choice of an initial partition when carrying out a
      partitioning cluster analysis, and the determination of the 
      appropriate number of clusters. Each of these decisions and choices 
      can affect the classification results. 
      Apart from two commands for determining the number of clusters 
      (cluster stop, cluster dendrogram), Stata has no built-in utilities 
      for examining clustering results. We therefore developed 
      some simple tools which provide additional evaluation criteria: 

      – programs assisting in determining the number of clusters 
        (Mojena’s stopping rules for hierarchical clustering techniques, 
        PRE coefficient, F-Max statistic and Beale’s F values for 
        a partitioning cluster analysis),

      – a program for testing the stability of classifications produced 
        by different cluster analyses (Rand index), and 

      – a program that computes ETA2 in order to assess how well the
        clustering variables separate the clusters.

      The presentation introduces these programs and discusses their
      usefulness in comparison with other tools for the evaluation of
      clustering results (agglomeration schedule, scree diagram).

Towards an Open Wish List to StataCorp

16:10 Stata goes BUGS (via R)
      Susumu Shikano, University of Mannheim

      Recently, Bayesian methods such as Markov chain Monte Carlo (MCMC)
      techniques have found increasing use in the social sciences, with 
      (Win)BUGS being one of the most widely applied software packages for 
      this kind of analysis. Unfortunately, because Stata offers neither
      MCMC techniques nor any interface to WinBUGS or BUGS, Stata 
      users who apply MCMC techniques have to perform such painful tasks 
      as reformatting data by themselves. As a preliminary solution to 
      this problem, one can call another statistical package, R, from inside
      Stata and use it as an interface to (Win)BUGS. This presentation
      outlines this solution and provides an example analysis.

16:40 Optimal Large Package Administration for Stata
      Markus Hahn, RWI Essen

      The Stata package tool is quite simple to use for smaller ADO packages
      stored on user webpages. However, when the number of files in a package
      becomes large and the files need to be updated on a regular basis, this
      becomes cumbersome. Package updates could take many minutes to complete.
      Here a method of storing packages as compressed archives on the host
      server is outlined, whereby the user sends a query to the update server
      to check for a new version. If a new version is available, the package
      archive is downloaded in its entirety, and then extracted and installed
      locally. This is far more efficient with respect to installation times
      (typically only 1/10 of the time needed) than downloading many text
      files individually. For large packages, the bottleneck is most often the
      download time. Currently this automated updating can be achieved with a
      Stata ado-file and the aid of additional binaries (such as tar, gzip, zip).
      The usability of this technique would be enhanced dramatically if the
      functionality of an archiving format (such as tar, gzip, zip) were
      directly integrated into the Stata binary. Encrypted files could be
      distributed in this manner as well. Ado-files inside the package archive
      can be configured to make an automatic call to the host server to check
      for available updates.

17:10 Coffee

17:20 Report to the users
      Alan Riley, StataCorp

17:50 Wishes and Grumbles

18:30 End of the Meeting

+49 (030) 25491-361
