Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

Re: st: dropping vars from analysis under conditions

From   Steve Samuels
Subject   Re: st: dropping vars from analysis under conditions
Date   Tue, 17 Apr 2012 19:12:20 -0400

I think Maarten is correct. Katya is attempting a discrete-time duration
analysis by adding the time-interval indicators "interval2 interval3
interval4 interval5 interval6 interval7". The logistic model operates
interval by interval. Her event indicator is zero for all intervals
except those in which an event occurred. Although the number of
observations is expanded, the number of events is not, so the
effective amount of information in the data is unchanged.
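
The expansion argument can be checked on a toy example. The sketch below is not Katya's data or command: the four people, the interval counts, and the hazard coefficients are all made up for illustration. It expands each person into one row per interval at risk, confirms that the number of events does not grow with the number of rows, and confirms that summing the log-likelihood over all rows in one go matches summing within each person first.

```python
import math

# Hypothetical toy data: (person id, intervals at risk, event in last interval?)
people = [(1, 3, True), (2, 5, False), (3, 1, True), (4, 4, True)]

def expand(people):
    """Return one (id, interval, event) row per person per interval at risk."""
    rows = []
    for pid, n_intervals, had_event in people:
        for j in range(1, n_intervals + 1):
            # The indicator is 1 only in the final interval of a person
            # who experienced the event; it is 0 everywhere else.
            event = 1 if (had_event and j == n_intervals) else 0
            rows.append((pid, j, event))
    return rows

def hazard(j):
    """Assumed discrete hazard for interval j (logit link, made-up coefficients)."""
    return 1.0 / (1.0 + math.exp(-(-2.0 + 0.3 * j)))

def row_loglik(j, event):
    """Bernoulli log-likelihood contribution of one person-period row."""
    h = hazard(j)
    return math.log(h) if event else math.log(1.0 - h)

rows = expand(people)
n_rows = len(rows)
n_events_expanded = sum(e for _, _, e in rows)
n_events_original = sum(1 for _, _, had_event in people if had_event)

# Sum over all person-period rows in one go (what the expanded logit does).
ll_pooled = sum(row_loglik(j, e) for _, j, e in rows)

# Sum within each person first, then over persons.
ll_by_person = 0.0
for pid, _, _ in people:
    ll_by_person += sum(row_loglik(j, e) for p, j, e in rows if p == pid)

print(n_rows, n_events_expanded, n_events_original)
print(abs(ll_pooled - ll_by_person) < 1e-12)
```

Here 4 people become 13 person-period rows, but there are still only 3 events, and the two ways of summing the log-likelihood agree.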

However, I don't like Katya's analysis. There's a lot I don't
understand, because she did not describe her data well or show us the
actual command.

 Among the issues:

 1) she doesn't include a cluster() option, so the standard errors
    are probably incorrect;
 2) the parameters of the logistic model are not invariant to the
    choice of intervals;
 3) the standard model would be a discrete hazard or complementary
    log-log model;
 4) if she has survey data, she is completely ignoring the sample
    design;
 5) a discrete hazard model without time-dependent covariates over a
    large number of intervals is of doubtful use to me.
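
Points 2) and 3) can be spelled out with a short derivation. This is a sketch under an assumption not stated in the thread: that the underlying continuous-time process follows a proportional hazards model. The symbols (baseline hazard \lambda_0, cut points a_j, intercepts \gamma_j and \alpha_j) are notation introduced here for illustration.

```latex
% Assumed continuous-time proportional hazards model:
\lambda(t \mid x) = \lambda_0(t)\, \exp(x'\beta)

% Discrete hazard for the interval (a_{j-1}, a_j]:
h_j(x) = \Pr(T \le a_j \mid T > a_{j-1},\, x)
       = 1 - \exp\!\Big( -\exp(x'\beta) \int_{a_{j-1}}^{a_j} \lambda_0(u)\, du \Big)

% With \gamma_j = \log \int_{a_{j-1}}^{a_j} \lambda_0(u)\, du,
% the complementary log-log link is linear in x:
\log\!\big( -\log( 1 - h_j(x) ) \big) = \gamma_j + x'\beta

% By contrast, the logistic model
\log \frac{h_j(x)}{1 - h_j(x)} = \alpha_j + x'\beta^{\text{logit}}
% has no such representation: \beta^{\text{logit}} changes with the cut points a_j.
```

Under that assumption only the intercepts \gamma_j depend on how the intervals are drawn, while \beta is the same for any grouping. The logit coefficients carry no such guarantee.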

On Tue, Apr 17, 2012 at 8:18 AM, Nick Cox wrote:
> Sure, but that is not my point here. Katya said her data were expanded
> by length of time. Suppose I am an observation, you are an
> observation, and so on, and you -expand- by (e.g.) years on Statalist,
> months on Statalist, days on Statalist. (a) The answer is different in
> terms of implied sample size and (b) you replace individual
> observations by blocks of otherwise identical observations. As I said,
> sounds dubious to me. If Katya explains that she didn't do that, fine.
> If Katya explains that it does make sense, fine.
> Nick
> On Tue, Apr 17, 2012 at 1:09 PM, Maarten Buis wrote:
>> On Tue, Apr 17, 2012 at 12:35 PM, Nick Cox wrote:
>>> Expansion by time spent also sounds very dubious. If that means #
>>> observations for # units of time spent, well, the frequency
>>> interpretation depends on units of time being discrete, and on which
>>> units you use, and there is now a cluster structure.
>> There are situations where this can make sense. This can be used as a
>> trick to estimate a discrete time survival analysis model or a
>> sequential logit model. In those cases the total contribution of each
>> individual to the log-likelihood is the sum of the log-likelihoods of
>> passing each step/period/transition. It does not matter if we first
>> sum the contributions of each transition within a person and then sum
>> over persons (which is what a purpose-written program (might) do),
>> or do the entire sum in one go (which is what you do when you expand).
>> So, the expansion can be used as a computational trick with which you
>> can estimate a survival model using programs that are not designed to
>> estimate a survival model.
>> Having said all that, using such tricks correctly is tricky. These
>> programs are not designed for that kind of analysis, and there can
>> easily be many options and post-estimation commands that will give you
>> output that does not make sense in this case. One example I can think
>> of right now is anything that relies on the sample size: e.g. BIC and
>> AIC values, but there may be (many) more. It is now up to the user to
>> understand what does and does not make sense. On the other hand Stata
>> has a whole suite of programs specifically designed for analyzing
>> survival data, see -help st-. Using these commands seems to me the
>> safer option.
>> Hope this helps,
>> Maarten
>> --------------------------
>> Maarten L. Buis
>> Institut fuer Soziologie
>> Universitaet Tuebingen
>> Wilhelmstrasse 36
>> 72074 Tuebingen
>> Germany
>> --------------------------

*   For searches and help try:
