
From: Lili Yan <lyan16@gmail.com>
To: statalist <statalist@hsphsun2.harvard.edu>
Subject: st: help needed on discrete-time hazard model
Date: Thu, 18 Oct 2007 10:29:45 -0400

Dear Statalist readers,

I am a new user of Stata and am having a problem with a discrete-time hazard model. I am not sure whether I am using the correct Stata commands or whether the data are set up correctly. If anyone can give me some suggestions or tips, I will truly appreciate it.

Here is my question: we want to know whether a higher price predicts quitting smoking. We have a 3-wave survey of smokers in the US and Canada. We look at smokers with 4 smoking patterns: SSS, SSQ, SQS, and SQQ. SSS means the respondent is a smoker at all 3 waves, SSQ means he or she smokes at the first two waves but has quit by the third wave, and so on.

The data are set up as follows. (Somebody else did this setup; I think it is correct based on my limited knowledge of this model. Please let me know if anything here is incorrect.)

1) Starting from a one-row-per-person dataset, create a variable indicating the number of waves at which a smoker is at "risk" of quitting. SSS and SSQ respondents are assigned the value 3, and SQQ and SQS respondents the value 2.
2) Based on this indicator, expand the dataset, so that SSS and SSQ respondents have 3 rows of observations per person, and SQS and SQQ respondents have 2 rows per person.
3) By uniqid, that is, for each person, create a counter of rows.
4) By uniqid, create a binary variable "qtsmok" which equals 1 in the last row for SQQ, SQS and SSQ; it equals 0 in all rows for SSS and in every row other than the last for SQQ, SQS and SSQ. This is the dependent variable of our model.
5) Create a "wave" variable which takes the values 1, 2 and 3 to indicate the wave; 3 dummies (wave1, wave2 and wave3) are created as well.

Then I set up the survey design with the strata and weights, and I use the -svy: logit- command. The explanatory variables include the conventional demographic and socioeconomic variables, price, a dummy variable for Canada, wave2 and wave3. In wave 1 everybody is a smoker, so no "quitting" event happens there; I therefore do not include the "wave1" indicator in the equation.
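For concreteness, steps 1) to 5) above can be sketched on a toy dataset roughly as follows. This is only an illustration of the setup as described, not the actual do-file: the four example respondents and the variable names "pattern" and "nrisk" are hypothetical, while uniqid, qtsmok, wave and wave1-wave3 follow the description above.

```stata
* Hypothetical toy data: one row per person, with the 3-wave smoking pattern
clear
input uniqid str3 pattern
1 "SSS"
2 "SSQ"
3 "SQS"
4 "SQQ"
end

* Step 1: number of waves at which the person is at "risk" of quitting
* (nrisk is an illustrative name, not from the real dataset)
generate byte nrisk = cond(inlist(pattern, "SSS", "SSQ"), 3, 2)

* Step 2: expand to one row per person per wave at risk
expand nrisk

* Steps 3 and 5: row counter within person, which also serves as the wave
bysort uniqid: generate byte wave = _n

* Step 4: event indicator, 1 only in the last row of SSQ, SQS and SQQ
bysort uniqid (wave): generate byte qtsmok = (_n == _N) & (pattern != "SSS")

* Step 5: wave dummies wave1, wave2 and wave3
forvalues w = 1/3 {
    generate byte wave`w' = (wave == `w')
}

list uniqid pattern wave qtsmok, sepby(uniqid)
```

With these four toy respondents the expanded file has 3 + 3 + 2 + 2 rows, and qtsmok equals 1 in exactly one row for each of the three respondents who quit.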
In addition, I use the "noconstant" option. All of my model setup is based on my reading of the online lecture notes by Prof. Stephen Jenkins in the UK.

The problem is that the coefficient on our price variable is negative (small in magnitude, though) and significant at the 1% level! This is not what we expected. I tried several things to explore this:

1) removing "wave2" and "wave3"
2) removing the survey settings
3) running the regression on only the US sample or only the Canada sample
4) running the regression with the "wave" variable, which has 3 values, instead of the dummies

I got similar results each time. Then I tried more:

5) Ignoring the fact that the dependent variable is binary, I used "svy: reg" instead. Now the coefficient on price is positive and significant at the 10% level!
6) There is a categorical variable in the dataset which defines the stages of smoking cessation: precontemplation, contemplation, preparation, action, and maintenance. A higher value means higher motivation to quit smoking, or that quitting has already happened. This variable is positively correlated with the binary quitting variable used as the dependent variable in this model. I cross-tabulated the two and they are consistent, so again it seems that our dependent variable is correct. I ran OLS and ordered logit models for this cessation stage variable. In both models, the coefficients on price are positive and significant.

Given all this, I really do not know how to explain the negative and significant price coefficient in the hazard model (the logit model). I have never fitted a hazard model before, and I am still new to Stata. I am not sure whether my problem is in the data setup or in the modeling.

Any suggestions will be greatly appreciated! Thanks for taking the time to read my question.

Best,
Lili

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

