Statalist



Re: st: help needed on discrete-time hazard model


From   "Lili Yan" <lyan16@gmail.com>
To   statalist <statalist@hsphsun2.harvard.edu>
Subject   Re: st: help needed on discrete-time hazard model
Date   Thu, 18 Oct 2007 13:27:25 -0400

Dear all,

I checked the data just now. After running the logit model with our
dependent variable, the stored results show:

e(N) = 5463
e(N_cds) = 0
e(N_cdf) = 0

So it seems there is something wrong in the data setup. Could anyone
please give me some help?

Thanks a lot!

Lili

On 10/18/07, Lili Yan <lyan16@gmail.com> wrote:
> Dear Statalist readers,
>
> I am a new user of Stata and am having a problem with a discrete-time
> hazard model. I am not sure whether I am using the correct Stata
> commands, or whether the data are set up correctly. If anyone can give
> me some suggestions or tips, I will truly appreciate it. Here is my
> question:
>
> We want to know whether higher price predicts quitting from smoking.
> We have a 3-wave survey on smokers in the US and Canada. We look at
> smokers with 4 smoking patterns: SSS, SSQ, SQS, and SQQ. SSS means one
> is a smoker at all 3 waves, SSQ means he or she smokes at the first two
> waves but quits at the third wave, and so on.
>
> The data are set up this way (somebody else actually set this up; I
> think her setup is correct based on my limited knowledge of this
> model. Please let me know if there is anything incorrect here.):
>
> 1) Starting from one-row-per-person dataset, create a variable to
> indicate the number of waves that smokers are at "risk" of quitting.
> So, SSS and SSQ respondents are assigned value 3 and SQQ and SQS value
> 2.
> 2) Based on this indicator, expand the dataset, so SSS and SSQ have 3
> rows of observation per person, and SQS and SQQ 2 rows per person.
> 3) By uniqid, that is for each person, create a counter of rows.
> 4) By uniqid, create a binary variable "qtsmok" which equals 1 at the
> last row for SQQ, SQS and SSQ; it takes value 0 for all rows of SSS
> and other-than-last row(s) of SQQ, SQS and SSQ. This is the dependent
> variable of our model.
> 5) A "wave" variable is created, which takes values 1, 2 and 3 to
> indicate the wave; 3 dummies - wave1, wave2 and wave3 - are created as
> well.
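
The five setup steps above can be sketched as follows. This is only an illustration of the described layout, written in Python rather than Stata; the field names uniqid, qtsmok, wave and wave1-wave3 follow the post, while the function name, the "pattern" field and the four example people are hypothetical.

```python
# Sketch of the person-period expansion described in steps 1-5.
# Names other than uniqid, qtsmok, wave, wave1..wave3 are hypothetical.

AT_RISK_WAVES = {"SSS": 3, "SSQ": 3, "SQS": 2, "SQQ": 2}  # step 1


def expand_person_period(people):
    """Turn one-row-per-person records into one row per wave at risk."""
    rows = []
    for person in people:
        pattern = person["pattern"]
        if pattern not in AT_RISK_WAVES:
            raise ValueError(f"unknown smoking pattern: {pattern!r}")
        n_risk = AT_RISK_WAVES[pattern]
        # Steps 2-3: one row per wave at risk; the row counter is the wave.
        for wave in range(1, n_risk + 1):
            is_last_row = wave == n_risk
            # Step 4: qtsmok is 1 only on the last row of SSQ, SQS and SQQ.
            qtsmok = int(is_last_row and pattern != "SSS")
            row = {"uniqid": person["uniqid"], "wave": wave, "qtsmok": qtsmok}
            # Step 5: wave dummies wave1, wave2, wave3.
            for k in (1, 2, 3):
                row[f"wave{k}"] = int(wave == k)
            rows.append(row)
    return rows


# Hypothetical example: one person per smoking pattern.
people = [
    {"uniqid": 1, "pattern": "SSS"},
    {"uniqid": 2, "pattern": "SSQ"},
    {"uniqid": 3, "pattern": "SQS"},
    {"uniqid": 4, "pattern": "SQQ"},
]
person_period = expand_person_period(people)
```

Under this layout every wave-1 row has qtsmok equal to 0, because the earliest row on which an event can be recorded is row 2.
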
>
> Then I set up the survey design with the strata and weights. I use
> -svy: logit- command. The explanatory variables include the
> conventional demographic and socioeconomic variables, price, a dummy
> variable for Canada, wave2 and wave3. Since in wave 1 everybody is a
> smoker, no "quitting" event happens, so I do not include the "wave1"
> indicator in the equation. Besides, I use the "noconstant" option;
> my whole model setup is based on my reading of the on-line lecture notes
> by Prof. Stephen Jenkins in the UK.
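
Written out, the specification described in this paragraph is roughly the following. The symbols are notation assumed here for illustration, not taken from the original analysis: h_{it} is the probability that qtsmok equals 1 on person i's row for wave t, and x_{it} collects the demographic and socioeconomic variables.

```latex
\operatorname{logit}(h_{it})
  = \log\frac{h_{it}}{1 - h_{it}}
  = \alpha_2\,\mathrm{wave2}_{it}
  + \alpha_3\,\mathrm{wave3}_{it}
  + \beta\,\mathrm{price}_{it}
  + \delta\,\mathrm{Canada}_{i}
  + \mathbf{x}_{it}'\boldsymbol{\gamma}
```

With no constant and no wave1 term, both wave dummies are zero on wave-1 rows, so the right-hand side there reduces to the price, Canada and x terms only.
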
>
> The problem is that the coefficient on our price variable is negative
> (small in magnitude, though) and significant at 1%! This is not what we
> expected. I tried many ways to explore:
> 1) removing "wave2" and "wave3"
> 2) removing survey setting
> 3) regression with only US or Canada sample
> 4) regression with the "wave" variable which has 3 values
> I got similar results each time. Then I tried more:
>
> 5) neglecting the fact that the dependent variable is binary, I used
> "svy: reg" instead; now the coefficient on price is positive and
> significant at 10%!
>
> 6) there is a categorical variable in the data set which defines
> smoking cessation stages: precontemplation, contemplation,
> preparation, action, and maintenance. A higher value of it means
> higher motivation to quit smoking or the quitting has already
> happened. This variable is positively correlated with the binary
> quitting dependent variable in this model. I cross-tabulated it with
> our dependent variable; it is consistent with our dependent variable -
> so again it seems that our dependent variable is correct.
> I ran OLS and ordered logit models for this cessation stage variable.
> In both models, the coefficients on price are positive and
> significant.
>
> Based on this, I really do not know how to explain the negative and
> significant price coefficient in the hazard model (the logit model).
>
> I have never fitted a hazard model before, and I am still new to Stata. I am not
> sure whether my problem is in the data setup or in the modeling. Any
> suggestions will be greatly appreciated!
>
> Thanks for your time reading my question.
>
> Best,
> Lili
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>