st: Frequency weights in xtlogit and xtcloglog
A very nice feature of the logit and cloglog estimation commands is their
ability to take frequency weights. In estimating a discrete-time survival
model, one can first generate "person-year" data by expanding (via expand
or stsplit) the original data set, collapse the expanded data set using
strate, and then estimate the models on a greatly reduced data set
(essentially an N-way contingency table). Frequency weights make this
possible for both logit and cloglog. Last night, I realized that this is
not the case for xtlogit and xtcloglog: in both cases, only "importance
weights" are allowed.
Here is what I want to do: I have a number of (say, a hundred) randomly
selected communities; within each community I have time-to-event data
(say, time to marriage) for a number of individuals, along with a number
of other categorical covariates. Within each community, I collapse the
individual-level data and generate contingency-table data using stsplit
and strate. I can then estimate a piecewise-constant model using poisson
with the exposure() option; I can also estimate a discrete-time
proportional-odds model using logit with frequency weights, or a
discrete-time proportional-hazards model using cloglog with frequency
weights.
Now I put all 100 communities together (by appending them) and try to
estimate random-effects versions of the piecewise-constant, discrete-time
proportional-odds, and discrete-time proportional-hazards models. It
appears to me that only the random-effects piecewise-constant model is
possible in Stata, because xtpoisson does not need frequency weights.
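For concreteness, here is a minimal sketch of the collapse-then-weight
workflow described above for a single community. All variable names
(id, dur, married, x1, period, freq) are hypothetical placeholders, not
from my actual data:

```stata
* Sketch only -- variable names are made up for illustration.
* Expand to one record per person per discrete time interval:
stset dur, failure(married) id(id)
stsplit period, every(1)
gen byte event = _d

* Collapse identical person-period records into a contingency table,
* keeping a count (freq) of how many records each cell represents:
gen long freq = 1
collapse (sum) freq, by(period x1 event)

* Discrete-time proportional-odds model on the collapsed table:
logit event i.period x1 [fweight=freq]

* Discrete-time proportional-hazards model on the same table:
cloglog event i.period x1 [fweight=freq]
```

With all communities appended, the analogous random-effects fits would
need [fweight=freq] on xtlogit or xtcloglog, which is exactly what is
not allowed.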
Here are my questions:
1). Is the above understanding correct?
2). Are there ways around this? I am especially interested in the
offset() option in both the xtlogit and xtcloglog commands, and wonder
what would happen if I put the frequency variable (or the logged
frequency) in it. I have never quite understood what offset variables
do in a binary-regression context.
3). Does GLLAMM have the same problem (i.e., not allowing frequency
weights in binary regression)?
4). Are there other approaches to achieve the same goal? (A
discrete-time hazard model using individual-level records is not
possible in my case because the data set is too big.)
Thank you very much!