



Re: st: dropping vars from analysis under conditions


From   Steve Samuels <sjsamuels@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: dropping vars from analysis under conditions
Date   Tue, 17 Apr 2012 23:44:24 -0400

Thanks, Richard. You (and Paul) are correct. The only reason to identify the individual is to use replication-based standard errors. Otherwise, the standard errors are based not on iid observations but on conditional likelihoods. I don't know how Katya's events were recorded, but if the measurements were grouped, I still think the -cloglog- approach is preferable.
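One reason the complementary log-log link is often preferred for grouped event times can be checked numerically. The following is a minimal Python sketch, not part of the original exchange, and it assumes a constant continuous-time hazard within each interval plus proportional hazards. The baseline rate, the coefficient BETA, and the interval widths are all hypothetical values chosen for illustration. Under those assumptions, the covariate effect on the cloglog scale equals BETA whatever the interval width, while the effect on the logit scale changes with the width.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def cloglog(p):
    """Complementary log-log transform of a probability."""
    return math.log(-math.log(1.0 - p))

def interval_hazard(rate, width):
    """Probability of an event within an interval of the given width,
    given survival to its start, under a constant continuous-time hazard."""
    return 1.0 - math.exp(-rate * width)

BASE_RATE = 0.10  # hypothetical baseline hazard per unit time
BETA = 0.7        # hypothetical log hazard ratio for a binary covariate

def link_differences(width):
    """Covariate effect (x=1 versus x=0) on the cloglog and logit scales."""
    h0 = interval_hazard(BASE_RATE, width)
    h1 = interval_hazard(BASE_RATE * math.exp(BETA), width)
    return cloglog(h1) - cloglog(h0), logit(h1) - logit(h0)

for width in (0.5, 1.0, 4.0):
    d_cloglog, d_logit = link_differences(width)
    print(f"width={width}: cloglog effect={d_cloglog:.4f}, logit effect={d_logit:.4f}")
```

With short intervals the two scales nearly agree; the gap grows as the intervals get wider.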

My comment about time-dependent covariates was unwarranted, as I see that Katya's model has a variable with a _tvc (time-varying covariate?) suffix. Katya was just trying to give us the information she thought we needed to answer her original question. I customarily like to step back and look at an entire analysis. I went too far here, and I apologize.


Steve
sjsamuels@gmail.com



On Apr 17, 2012, at 11:16 PM, Richard Williams wrote:

At 06:12 PM 4/17/2012, Steve Samuels wrote:
> I think Maarten is correct. Katya is trying for a discrete duration
> analysis, by adding the time intervals "interval2 interval3 interval4
> interval5 interval6 interval7". The logistic model operates
> interval-by-interval. Her event indicator is zero for all intervals
> except those in which an event occurred. Although the number of
> observations is expanded, the number of events would not be; so the
> effective amount of information in the data would be unchanged.
> 
> 
> However, I don't like Katya's analysis. There's a lot I don't
> understand, because she did not describe her data well or show us the
> actual command.
> 
> 
> Among the issues:
> 
> 1) she doesn't include a cluster() option, so the standard errors
>    are probably incorrect;
> 2) the parameters of the logistic model are not invariant to the
>    choice of intervals;
> 3) the standard model would be a discrete hazard or complementary
>    log-log model;
> 4) if she has survey data, she is completely ignoring the sample
>    design;
> 5) a discrete hazard model without time-dependent covariates over a
>    large number of intervals is of doubtful use to me.

Paul Allison has written a couple of pieces about Discrete Time Methods for the Analysis of Event Histories; see, e.g., his 1984 Sage "green book," "Event History Analysis." I believe he shows that the standard errors are correct and that you don't need clustering. Being able to conveniently incorporate time-varying covariates is a big advantage of the approach. It also handles right-censoring well. I'm not sure about some of your other concerns, but I am guessing you could use the svy: prefix. My own example discussing this is at

http://www.nd.edu/~rwilliam/xsoc73994/Panel01-EHAX.pdf
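
The person-period setup discussed in this thread can be sketched in a few lines. This is a minimal Python illustration with made-up subjects, not Katya's data or anyone's actual command; the names `subjects`, `expand_person_period`, and `interval_hazards` are hypothetical. Each subject contributes one row per interval at risk, and the event indicator is 1 only in the interval where the event occurred. Expanding the data increases the number of rows but not the number of events, and right-censored subjects simply stop contributing rows. The per-interval proportions computed here are what a logit model containing only interval dummies would reproduce as fitted hazards.

```python
from collections import defaultdict

# Hypothetical subjects: (id, last interval observed, event flag).
# event flag = 1 if the event occurred in the last interval, 0 if right-censored.
subjects = [(1, 3, 1), (2, 5, 0), (3, 2, 1), (4, 5, 1), (5, 1, 0)]

def expand_person_period(rows):
    """Return one (id, interval, y) record per subject per interval at risk."""
    records = []
    for sid, last, event in rows:
        for interval in range(1, last + 1):
            # y is 1 only in the interval where the event happened.
            y = 1 if (event == 1 and interval == last) else 0
            records.append((sid, interval, y))
    return records

def interval_hazards(records):
    """Observed discrete hazard per interval: events / subjects at risk."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, interval, y in records:
        at_risk[interval] += 1
        events[interval] += y
    return {k: events[k] / at_risk[k] for k in sorted(at_risk)}

records = expand_person_period(subjects)
hazards = interval_hazards(records)

print(len(subjects), "subjects expand to", len(records), "person-period rows")
print("events in original data:", sum(e for _, _, e in subjects))
print("events in expanded data:", sum(y for _, _, y in records))
for interval, h in hazards.items():
    print(f"interval {interval}: hazard = {h:.3f}")
```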


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



