st: RE: German Stata User Group Meeting - repost
From: email@example.com [mailto:firstname.lastname@example.org] On Behalf Of Johannes Giesecke
Sent: 19 March 2007 07:00
Subject: st: German Stata User Group Meeting - repost
This is a repost: the earlier email from Ulrich Kohler concerning the SUG program was difficult to read for some recipients.
Fifth German Stata Users Group Meeting in Essen
The 5th German Stata Users Group Meeting will be held on Monday, 2nd April 2007 in Essen at the RWI (Rheinisch-Westfälisches Institut für Wirtschaftsforschung). We would like to invite everybody from everywhere who is interested in using Stata to attend this meeting.
The academic program of the meeting is being organized by Johannes Giesecke, University of Mannheim (<email@example.com>),
John P. Haisken-DeNew, RWI Essen (<firstname.lastname@example.org>), and Ulrich Kohler, WZB (<email@example.com>). The conference language will be English due to the international nature of the meeting and the participation of non-German guest speakers.
The logistics of the conference are being organized by Dittrich und Partner, distributor of Stata in several countries including Germany, The Netherlands, Austria, and Poland (http://www.dpc.de).
8:30 - 9:00 Reception
9:00 - 9:10 Welcome
John P. Haisken-DeNew, RWI Essen
9:10 - 9:50 Why should you become a Stata programmer?
Kit Baum <firstname.lastname@example.org>, Boston College Economics
9:50 - 10:30 Making regression tables simplified
Ben Jann <email@example.com>, ETH Zurich
-estout-, introduced by Jann (2005), is a useful tool for producing regression tables from stored estimates. However, its syntax is relatively complex and commands can become lengthy even for simple tables. Furthermore, having to store the estimates beforehand can be a bit cumbersome. To facilitate the production of regression tables, I therefore present two new commands called -esto- and -esta-. -esto- is a wrapper for official Stata's -estimates store- and simplifies the storing of estimation results for tabulation. For example, -esto- does not require the user to provide names for the stored estimation sets.
-esta-, on the other hand, is a wrapper for -estout- and simplifies compiling nice-looking tables from the stored estimates without much typing. Basic applications of the commands and usage of -esta- with external software such as LaTeX, Word, or Excel will be illustrated by a range of examples.
10:30 - 10:45 Coffee
10:45 - 11:15 Assessing the reasonableness of an imputation model
Maarten L. Buis <M.Buis@fsw.vu.nl>, Vrije Universiteit Amsterdam
Multiple imputation is a popular way of dealing with missing values under the Missing At Random (MAR) assumption. Imputation models can become quite complicated, for instance when the model of substantive interest contains many interactions, or when the data originate from a nested design. This paper will discuss two methods to assess how plausible the results are. The first method consists of comparing the point estimates obtained by multiple imputation with point estimates obtained by another method for controlling for bias due to missing data.
Second, the changes in standard error between the model that ignores the missing cases and the multiple imputation model are decomposed into three components: changes due to changes in 'sample size', changes due to uncertainty in the imputation model used in multiple imputation, and changes due to changes in the estimates that underlie the standard error. This decomposition helps in assessing the reasonableness of the change in standard error. These two methods will be illustrated with two new user-written Stata commands.
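As background for the decomposition discussed above, the standard step of combining results from m imputed data sets (Rubin's rules) can be sketched in a few lines of Python; the helper name `rubin_combine` and the numbers are purely illustrative, not the speaker's code:

```python
from statistics import mean

def rubin_combine(estimates, variances):
    """Combine point estimates and variances from m multiply imputed
    data sets using Rubin's rules.

    estimates -- list of m point estimates (one per imputed data set)
    variances -- list of m squared standard errors
    Returns (pooled estimate, within-, between-, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)                       # pooled point estimate
    w = mean(variances)                           # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation
    t = w + (1 + 1 / m) * b                       # total variance
    return q_bar, w, b, t

# Example: three imputed-data estimates of a regression coefficient
est, w, b, t = rubin_combine([1.02, 0.95, 1.10], [0.040, 0.042, 0.038])
```

The between-imputation term b is what grows with uncertainty in the imputation model, which is the second component of the decomposition described in the abstract.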
11:15 - 11:45 The influence of categorizing survival time on parameter estimates in a Cox model
Anika Buchholz, Willi Sauerbrei, Patrick Royston
University of Freiburg, University Medical Center Freiburg, MRC Clinical Trials Unit, London
With longer follow-up times, the proportional hazards assumption in the Cox model becomes questionable. Cox suggested including an interaction between a covariate and a function of time. Estimating such a function in Stata requires a substantial enlargement of the data, which may cause severe computational problems. To handle this problem we will consider categorizing survival time, which raises issues as to the number of cutpoints, their position, the increased number of ties, and the loss of information. Sauerbrei et al. (2007) proposed a new selection procedure to model potential time-varying effects. They investigate a large data set (N=2982) with 20 years of follow-up, for which the Stata command stsplit creates about 2.2 million records; categorizing the data in 6-month intervals gives 35747 records. We will systematically investigate the influence of the length of the categorization intervals and of the four methods of handling ties in Stata. The results of our categorization approach are promising, showing a sensible way to handle time-varying effects even in simulation studies.
Reference: Sauerbrei, W., Royston, P., and Look, M. (2007). A new proposal for multivariable modelling of time-varying effects in survival data based on fractional polynomial time-transformation. Biometrical Journal, in press.
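The episode-splitting step the abstract refers to can be illustrated with a minimal Python sketch (a toy stand-in for what Stata's -stsplit- does to a single record, with a hypothetical `split_episode` helper, not the authors' code):

```python
def split_episode(time, event, width=0.5):
    """Split one survival record (time in years, event indicator) into
    episodes of the given width, mimicking -stsplit- for one subject.
    Returns a list of (start, stop, event) rows; only the last row can
    carry the event.
    """
    rows, start = [], 0.0
    while start + width < time:
        rows.append((start, start + width, 0))   # censored interim episode
        start += width
    rows.append((start, time, event))            # final episode gets the event
    return rows

rows = split_episode(1.3, 1)   # 1.3 years of follow-up, event occurred
# three rows: (0.0, 0.5, 0), (0.5, 1.0, 0), (1.0, 1.3, 1)
```

Splitting every subject this way is what multiplies the record count, and shorter intervals (as in the 2.2 million-record example above) make the problem worse.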
11:45 - 12:15 Oaxaca/Blinder Decompositions for Non-Linear Models
Matthias Sinning <firstname.lastname@example.org> and Markus Hahn <email@example.com>, RWI Essen, University of Bochum
This paper describes the estimation of a general Blinder-Oaxaca decomposition of the mean outcome differential of linear and non-linear regression models. Starting from this general model, we show how it can be applied to different models with discrete and limited dependent variables.
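For the linear case, the decomposition of the mean gap can be sketched as follows (illustrative Python with made-up data and one covariate; the function names are hypothetical and this is not the authors' implementation):

```python
from statistics import mean

def ols_simple(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

def oaxaca_twofold(xa, ya, xb, yb):
    """Twofold Blinder-Oaxaca decomposition of the mean outcome gap
    between groups A and B in a linear model with one covariate,
    using B's coefficients as the reference:
      mean(ya) - mean(yb) = explained + unexplained
    """
    b0a, b1a = ols_simple(xa, ya)
    b0b, b1b = ols_simple(xb, yb)
    explained = b1b * (mean(xa) - mean(xb))             # endowment part
    unexplained = (b0a - b0b) + mean(xa) * (b1a - b1b)  # coefficient part
    return explained, unexplained

xa, ya = [1, 2, 3, 4], [2.1, 2.9, 4.2, 4.8]
xb, yb_ = [0, 1, 2, 3], [1.0, 1.8, 2.9, 3.5]
exp_part, unexp_part = oaxaca_twofold(xa, ya, xb, yb_)
gap = mean(ya) - mean(yb_)
# exp_part + unexp_part reproduces the raw gap (OLS fits group means)
```

Because OLS passes through the group means, the two parts sum exactly to the raw gap; the talk's contribution is extending this logic to non-linear models, where that identity no longer holds automatically.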
12:15 - 13:15 Lunch
13:15 - 13:45 Estimating Double-Hurdle Models with Dependent Errors and Heteroscedasticity
Julian A. Fennema <J.A.Fennema@hw.ac.uk>, Heriot-Watt University, Edinburgh
This paper describes the estimation of the parameters of a double-hurdle model in Stata. It is shown that the independent double-hurdle model can be estimated using a combination of existing commands. Likelihood evaluators to be used with Stata's ml facilities are derived to illustrate how to fit independent and dependent inverse hyperbolic sine double-hurdle models with heteroscedasticity.
13:45 - 14:15 Measuring Richness
Andreas Peichl <firstname.lastname@example.org> University of Cologne
In this paper, we describe richness, a Stata program for the calculation of richness indices. Peichl, Schaefer and Scheicher (2006) propose a new class of richness measures to contribute to the debate on how to deal with the financing problems that European welfare states face as a result of global economic competition. In contrast to the often-used headcount, these new measures are sensitive to changes in rich persons' incomes.
This allows for a more sophisticated analysis of richness, namely the question of whether the gap between rich and poor is widening. We propose to use our new measures in addition to the headcount index for a more comprehensive analysis of richness.
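The contrast between the headcount and an income-sensitive measure can be illustrated with a small Python sketch; the functional form below is a generic illustration, not necessarily the exact definition used by Peichl, Schaefer and Scheicher:

```python
def richness_headcount(incomes, rho):
    """Share of the population with income above the richness line rho."""
    return sum(y > rho for y in incomes) / len(incomes)

def richness_index(incomes, rho, alpha=1.0):
    """An income-sensitive richness index (illustrative functional form):
        R = (1/n) * sum over rich of (1 - rho / y_i) ** alpha
    Unlike the headcount, it rises when a rich person's income grows.
    """
    return sum((1 - rho / y) ** alpha for y in incomes if y > rho) / len(incomes)

incomes = [20, 35, 60, 120, 300]
rho = 100                               # richness line (hypothetical)
hc = richness_headcount(incomes, rho)   # 2 of 5 are rich -> 0.4
r1 = richness_index(incomes, rho)
r2 = richness_index([20, 35, 60, 120, 600], rho)
# r2 > r1: raising one rich income increases the index,
# while the headcount stays at 0.4 in both cases
```

This is exactly the sensitivity property the abstract highlights: the headcount cannot register a widening gap among the rich, while the index above can.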
14:15 - 14:45 Robust income distribution analysis
Philippe Van Kerm <email@example.com>, CEPS/INSTEAD, Luxembourg
Extreme data are known to be highly influential when measuring income inequality from micro-data. Similarly, Lorenz curves and dominance criteria are very sensitive to data contamination in the tails of the distribution. In this presentation, I intend to introduce a set of user-written packages that implement robust statistical methods for income distribution analysis. These methods are based on the estimation of parametric models (Pareto, Singh-Maddala) using "optimal B-robust" estimators rather than maximum likelihood. Empirical examples show how robust inequality estimates and dominance checks can be derived from these models.
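As a point of comparison for the robust estimators, the non-robust maximum likelihood fit of a Pareto tail index, and its sensitivity to a single contaminated observation, can be sketched in Python (toy data, not the presenter's code):

```python
import math

def pareto_alpha_mle(incomes, y0):
    """Maximum likelihood estimate of the Pareto tail index alpha for
    incomes at or above the threshold y0 (the non-robust baseline):
        alpha_hat = n / sum(log(y_i / y0))
    """
    tail = [y for y in incomes if y >= y0]
    return len(tail) / sum(math.log(y / y0) for y in tail)

clean = [110, 130, 160, 200, 260, 350]
a_clean = pareto_alpha_mle(clean, 100)
a_dirty = pareto_alpha_mle(clean + [50_000], 100)   # one contaminated point
# a_dirty is far below a_clean: a single extreme value drags the ML
# estimate down, which is what motivates robust alternatives
```

Robust estimators such as the optimal B-robust ones in the talk bound the influence of any single observation, so one contaminated income cannot dominate the fit in this way.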
14:45 - 15:00 Coffee
15:00 - 15:25 PanelWhiz: A Stata Interface for Large Scale Panel Data Sets
John P. Haisken-DeNew <firstname.lastname@example.org>, RWI Essen
This paper outlines a panel data retrieval program written for Stata/SE or better, which allows easier access to household panel data sets. Using a drop-down menu system, the researcher selects variables from any and all available years of the panel. The data are automatically retrieved and merged to form a "long file", which can be used directly by the Stata panel estimators. The system implements modular data cleaning programs called "plugins". Yearly updates to the data retrievals can be made automatically. Projects can be stored in libraries, allowing modular administration and appending. PanelWhiz is available for SOEP, IAB-Betriebspanel, HILDA, CPS-NBER, and CPS-CEPR. Other popular data sets will be supported soon.
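The "long file" assembly that PanelWhiz automates can be sketched, very schematically, in Python (hypothetical data layout and helper name, not PanelWhiz code):

```python
def build_long_file(yearly_waves, variables):
    """Pull the selected variables from each yearly wave and stack them,
    one row per person-year -- the 'long file' layout that panel
    estimators expect.
    """
    long_rows = []
    for year, wave in sorted(yearly_waves.items()):
        for pid, record in wave.items():
            row = {"pid": pid, "year": year}
            row.update({v: record.get(v) for v in variables})
            long_rows.append(row)
    return long_rows

waves = {
    2005: {1: {"income": 30000, "hours": 38}, 2: {"income": 24000, "hours": 20}},
    2006: {1: {"income": 31500, "hours": 40}},
}
long_file = build_long_file(waves, ["income"])
# one row per person-year: pid 1 and pid 2 in 2005, pid 1 in 2006
```

PanelWhiz's contribution is doing this selection and merging across all available waves through a menu system, plus the plugin-based cleaning described in the next talk.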
15:25 - 15:50 PanelWhiz Plugins: Automatic Vector-Oriented Data Cleaning for Large Scale Panel Data Sets
Markus Hahn <email@example.com>, RWI Essen and University of Bochum
PanelWhiz "plugins" are modular data cleaning programs for specific items in PanelWhiz. Each plugin is designed to recode, deflate, or otherwise change existing variables being extracted in a panel-data retrieval.
Furthermore, new variables can be generated on the fly. The PanelWhiz plugin system is a "macro language" that uses new-style dialog boxes and Stata's modularized "class" system, allowing a vector orientation for data cleaning. The PanelWhiz plugins can even be generated using a PanelWhiz plugin front-end, allowing users to create plugins without having to write Stata code themselves. The system is set up to allow data cleaning of any PanelWhiz-supported data set.
15:50 - 16:15 A model for transferring variables between different data-sets based on imputation of individual scores
Bojan Todosijevic <B.Todosijevic@utwente.nl>, University of Twente
Variables of interest are often scattered across different data sets. Given two methodologically similar surveys, a question not asked in one of them can be seen as a special case of the missing-data problem (Gelman et al., 1998). The paper presents a model for transferring variables between different data sets by applying the procedures for multiple imputation of missing values. The feasibility of this approach was assessed using two Dutch surveys:
Social and Cultural Developments in The Netherlands (SOCON 2000) and the Dutch Election Study (NKO 2002). An imputation model for the left-right ideological self-placement was developed based on the SOCON survey. In the next step, left-right scores were imputed to the respondents from the NKO study. The outcome of the imputation was evaluated, first, by comparing the imputed variables with the left-right scores collected in three waves of the NKO study. Second, the imputed and the original NKO left-right variables are compared in terms of their associations with a broad set of attitudinal variables from the NKO data set. The results show that one would reach similar conclusions using the original or imputed variable, albeit with the increased risk of making Type II errors.
16:15 - 16:30 Coffee
16:30 - 17:00 Two Issues in Remote Data Access
Peter Jacobebbinghaus <Peter.Jacobebbinghaus@iab.de>, IAB
At the Research Data Centre of the BA at the IAB, researchers can send in Stata programs to be processed there, with the log files sent back to them after a disclosure limitation review. This method of data access is called remote data access, and the reason for it is data confidentiality. Remote data access has two non-standard requirements:
Efficient use of the computer resources and automation of parts of the disclosure limitation review. I would like to talk about how we deal with these requirements and discuss ways to improve them.
17:00 - 17:30 Report to the Users
Bill Rising, StataCorp
17:30 - 18:00 Wishes and Grumbles
Participants are asked to travel at their own expense. There will be a small conference fee to cover costs for coffee, teas, and luncheons.
There will also be an optional informal meal at a restaurant in Essen on Monday evening at additional cost.
You can enroll by contacting Anke Mrosek by email or by writing, phoning, or faxing to
Dittrich & Partner Consulting GmbH
Kieler Str. 17
Tel: +49 (0) 212 260 66-24
Fax: +49 (0) 212 260 66-66
We look forward to seeing you in Essen on April 2nd where you can help us to make this an exciting and interesting event.
The conference venue is the RWI (Rheinisch-Westfälisches Institut für Wirtschaftsforschung) in Essen.
Johannes Giesecke, John P. Haisken-DeNew, Ulrich Kohler
Dr. Johannes Giesecke
Fakultät für Sozialwissenschaften
Lehrstuhl für Methoden der empirischen Sozialforschung und angewandte Soziologie
Seminargebäude A5, Bauteil A
Tel.: (0621) 181 2045
Fax: (0621) 181 2048
* For searches and help try: