



Re: st: weights for a longitudinal set


From   <S.Jenkins@lse.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   Re: st: weights for a longitudinal set
Date   Fri, 4 Nov 2011 09:47:02 -0000

I think Austin makes a good argument here. For the record, I'd had in
mind panel regression modelling (rather than survival analysis), and use
of longitudinal samples that had been refreshed.

Perhaps a take-away point that Austin and I would agree on is that the
choice of weights for longitudinal data analysis really is complicated.
And most textbook or even monograph discussions I am aware of do not
consider this issue in sufficient detail for many empirical analysts.
Moreover, to reiterate, most weights supplied with surveys are "general
purpose" weights. I suspect that "general purpose" often doesn't
correspond with the analysis topic and sample that many of us have in
mind -- which leads me to support Austin's remarks about considering
developing your own weights, or indeed jointly modelling the outcome of
interest and response (initial and retention).

I also observe that many economists have 'avoided' these issues by
simply not using any weights at all! (Without discussion.)


++++++++++++++++++
Date: Thu, 3 Nov 2011 12:39:01 -0400
From: Austin Nichols <austinnichols@gmail.com>
Subject: Re: st: weights for a longitudinal set (Was: probable error,
"weights invalid" using stset)

Stephen <S.Jenkins@lse.ac.uk>:

Consider a simple discrete-time hazard model using a logit of y on
dummies for time and X.  We have weights in each of time periods 1968,
1969, ... 1993, and people drop out randomly conditional on time and X
in every year 1969, ... 1993.  Would you want to weight each
observation based on their weight in the last year observed, or the
1968 weight?  I contend you should use the 1968 weight, as that
corresponds to the maximum likelihood estimator you are mimicking with
your logit. The population here is the 1968 population eligible for
inclusion in the sample, and anyone without a 1968 weight is excluded.
For any other kind of analysis, you have to construct your own weight,
but using the last observed weight will never be correct, as far as I
can see, unless everyone appears in the same period at last
observation (stock sampling with no follow-up, in which case there is
no attrition correction in the weights, only nonresponse and raking).
With stock sampling with no follow-up, you would want the first actual
sample weights.  For sampling with refreshment, things are more
complicated, and there is no clear answer, but first observed weight
seems to me to be the closest to correct of any easy rule of thumb.
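
As a rough illustration of the base-year rule described above (a sketch
only, not code from this thread), the Python snippet below attaches each
person's 1968 weight to all of their person-year rows and drops anyone
without a 1968 weight. The records, the tuple layout, and the helper
names attach_base_year_weight and last_observed_weight are all
hypothetical, made up for this example.

```python
# Hypothetical person-year records: (person_id, year, survey_weight, event).
# All values are made up for illustration.
records = [
    (1, 1968, 1.2, 0), (1, 1969, 1.5, 0), (1, 1970, 1.9, 1),
    (2, 1968, 0.8, 0), (2, 1969, 1.1, 0),
    (3, 1969, 2.0, 0), (3, 1970, 2.4, 1),  # person 3 has no 1968 weight
]

BASE_YEAR = 1968


def attach_base_year_weight(rows, base_year=BASE_YEAR):
    """Give every person-year row that person's base-year weight.

    People with no weight in base_year are dropped, since they fall
    outside the base-year population.
    """
    base_weight = {pid: w for pid, year, w, _ in rows if year == base_year}
    return [
        (pid, year, base_weight[pid], event)
        for pid, year, _, event in rows
        if pid in base_weight
    ]


def last_observed_weight(rows):
    """Weight from each person's final observed year (the rule argued against)."""
    last = {}
    for pid, _year, w, _ in sorted(rows, key=lambda r: (r[0], r[1])):
        last[pid] = w  # later years overwrite earlier ones
    return last


weighted_rows = attach_base_year_weight(records)
for row in weighted_rows:
    print(row)
```

The rows in weighted_rows would then feed a weighted logit of the event
on time dummies and X in whatever package is used. last_observed_weight
is included only to make the contrast with the base-year rule visible;
it is not a recommendation.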
