st: Dummy variables in longitudinal models

From   Niels Schenk
Subject   st: Dummy variables in longitudinal models
Date   Tue, 28 Sep 2010 14:10:35 +0200

Dear statalist,

I am having trouble getting my head around the following issue. It is not directly related to Stata, but I am hoping you are willing to help me out. The dataset I use is a panel of respondents (see the example below). Some of these respondents transition into a certain state during the period of observation; others do not. The variable 'trans' indicates whether a respondent has transitioned at a given time point. In my example, respondents 1 and 2 have transitioned and respondent 3 has not. I am trying to determine whether and how the transition affects my dependent variable. The basic model I am estimating is: 'xtmixed depvar trans || time: '. I know I could use xtreg here, but this is a simplified representation of the model I am actually estimating.

My issue is the following:
I have both continuous and categorical variables that I want to use in my estimation. Estimating the effect of my variable 'contvar', and distinguishing that effect between those who have and those who have not transitioned, seems straightforward:
xtmixed depvar trans contvar trans#c.contvar || time:

However, the categorical variable I would like to use is only available for those who have made the transition. Creating dummy variables out of this categorical variable yields three dummy variables, one of which is the reference category. The problem I am having is that for respondents who have not transitioned, all dummy variables are zero, whereas for those who have transitioned, exactly one of them is one. Suppose I estimate the model 'xtmixed depvar trans dummy1 dummy2 || time:'. From the results I want to be able to conclude that respondents who have transitioned and have dummy1==1 differ significantly in depvar from respondents who have transitioned and have dummy3==1. I am thinking, though, that the reference category here is blurred: it can be either dummy3 or the case where the categorical variable is zero (i.e. not applicable). My conclusion would be that this approach is simply not valid, in that the coefficients do not represent what they are supposed to represent, but I have seen papers that use such an approach.
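
One way to make the reference-category question concrete is to write out the fixed part of the model with trans, dummy1 and dummy2. This is only a sketch: the beta symbols are notation introduced here, the random part is ignored, and it assumes (as in the example data) that trans equals 1 exactly when one of dummy1, dummy2, dummy3 equals 1.

```latex
\begin{align*}
E[\text{depvar} \mid \text{trans}=0] &= \beta_0 \\
E[\text{depvar} \mid \text{trans}=1,\ \text{dummy3}=1] &= \beta_0 + \beta_{\text{trans}} \\
E[\text{depvar} \mid \text{trans}=1,\ \text{dummy1}=1] &= \beta_0 + \beta_{\text{trans}} + \beta_1 \\
E[\text{depvar} \mid \text{trans}=1,\ \text{dummy2}=1] &= \beta_0 + \beta_{\text{trans}} + \beta_2
\end{align*}
```

Under that assumption, beta_1 and beta_2 would be contrasts with the dummy3 group among those who have transitioned, and beta_trans would be the contrast between the dummy3 group and those who have not transitioned. This may help in judging whether the reference category is in fact blurred.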

My question is whether people on this list think the use of dummy variables described above is appropriate. Estimating a model only for those who have transitioned is simply not an option. If the dummy-variable approach is not appropriate, perhaps some of you have suggestions on how to estimate the model I would like to estimate.

Another, totally unrelated, question is whether it is possible to mean-center dummy variables over time (within respondents) to separate the fixed and time-varying effects of dummy variables, as is possible with continuous variables. If so, any pointers to papers that do this would be greatly appreciated.
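
As a sketch of what mean-centering a dummy within respondents would amount to, here is the arithmetic for dummy1 in the example data further down. The symbols d_it (the dummy for respondent i at time t), its respondent mean, and the deviation are notation introduced here, not taken from a particular paper.

```latex
\begin{gather*}
\bar d_i = \frac{1}{T_i}\sum_{t=1}^{T_i} d_{it}, \qquad \tilde d_{it} = d_{it} - \bar d_i \\
\text{id}=1:\quad d_{1t} = (0,0,0,1),\quad \bar d_1 = 0.25,\quad \tilde d_{1t} = (-0.25,\,-0.25,\,-0.25,\,0.75) \\
\text{id}=2:\quad d_{2t} = (0,0,1,0),\quad \bar d_2 = 0.25,\quad \tilde d_{2t} = (-0.25,\,-0.25,\,0.75,\,-0.25) \\
\text{id}=3:\quad d_{3t} = (0,0,0,0),\quad \bar d_3 = 0,\quad \tilde d_{3t} = (0,\,0,\,0,\,0)
\end{gather*}
```

The respondent mean varies only between respondents and the deviation only within them, which is the same decomposition used for continuous variables. Whether the resulting coefficients have a sensible interpretation for a dummy is the open question.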

Thanks very much for any help,

Niels Schenk

Simplified representation of my dataset:

id	time	trans	depvar	contvar	catvar	dummy1	dummy2	dummy3
1	1	0	4	34	0	0	0	0
1	2	0	5	55	0	0	0	0
1	3	1	6	43	1	0	1	0
1	4	1	3	34	2	1	0	0
2	1	0	7	54	0	0	0	0
2	2	1	6	23	1	0	1	0
2	3	1	5	32	2	1	0	0
2	4	1	6	34	3	0	0	1
3	1	0	3	54	0	0	0	0
3	2	0	4	43	0	0	0	0
3	3	0	7	23	0	0	0	0
3	4	0	6	23	0	0	0	0
