



Re: st: weights for a longitudinal set (Was: probable error, "weights invalid" using stset)


From   Austin Nichols <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: weights for a longitudinal set (Was: probable error, "weights invalid" using stset)
Date   Thu, 3 Nov 2011 12:39:01 -0400

Stephen <S.Jenkins@lse.ac.uk>:

Consider a simple discrete-time hazard model using a logit of y on
dummies for time and X.  We have weights in each of time periods 1968,
1969, ... 1993, and people drop out randomly conditional on time and X
in every year 1969, ... 1993.  Would you want to weight each
observation based on their weight in the last year observed, or the
1968 weight?  I contend you should use the 1968 weight, as that
corresponds to the maximum likelihood estimator you are mimicking with
your logit. The population here is the 1968 population eligible for
inclusion in the sample, and anyone without a 1968 weight is excluded.
For any other kind of analysis, you have to construct your own weight,
but using the last observed weight will never be correct, as far as I
can see, unless everyone appears in the same period at last
observation (stock sampling with no follow-up, in which case there is
no attrition correction in the weights, only nonresponse and raking).
With stock sampling with no follow-up, you would want the first actual
sample weights.  For sampling with refreshment, things are more
complicated, and there is no clear answer, but the first observed weight
seems to me the closest to correct of any easy rule of thumb.
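
To make the competing rules concrete, here is a minimal sketch (in Python
rather than Stata, purely for illustration) of how each rule assigns one
constant weight per person in a person-year file. The record layout, the
function name constant_weight, and the toy numbers are all assumptions made
up for this example; nothing here reflects the actual PSID file structure.

```python
# Hypothetical person-year records: (person_id, year, weight).
# A weight of None means no weight was provided for that person-year,
# so "observed" below means "observed with a weight".
def constant_weight(rows, rule="base", base_year=1968):
    """Return {person_id: weight} under one of three rules.

    rule="base"  -> weight in base_year; persons without one are excluded
    rule="first" -> weight in the first year the person is observed
    rule="last"  -> weight in the last year the person is observed
    """
    by_person = {}
    for pid, year, w in rows:
        if w is not None:
            by_person.setdefault(pid, []).append((year, w))

    out = {}
    for pid, obs in by_person.items():
        obs.sort()  # chronological order
        if rule == "base":
            match = [w for year, w in obs if year == base_year]
            if match:  # no base-year weight: person is out of the population
                out[pid] = match[0]
        elif rule == "first":
            out[pid] = obs[0][1]
        elif rule == "last":
            out[pid] = obs[-1][1]
        else:
            raise ValueError(f"unknown rule: {rule}")
    return out


rows = [
    (1, 1968, 10.0), (1, 1969, 11.0), (1, 1970, 12.5),  # observed 3 years
    (2, 1968, 20.0), (2, 1969, 22.0),                   # drops out after 1969
    (3, 1969, 30.0), (3, 1970, 31.0),                   # no 1968 weight
]

base = constant_weight(rows, "base")
first = constant_weight(rows, "first")
last = constant_weight(rows, "last")
```

Under the base-year rule person 3 is excluded, which matches defining the
population as those eligible in 1968. The chosen weight would then be
attached to every person-year row for that person before fitting the logit.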

On Thu, Nov 3, 2011 at 5:55 AM,  <S.Jenkins@lse.ac.uk> wrote:
> ------------------------------
>
> Date: Wed, 2 Nov 2011 16:10:27 +0000
> From: "Brown, Elizabeth" <ebrown@prgs.edu>
> Subject: RE: st: probable error, "weights invalid" using stset
>
> Oh, yes. That makes sense re: inverse probability of selection. Of
> course. Thank you.
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Austin
> Nichols
> Sent: Tuesday, November 01, 2011 12:37 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: probable error, "weights invalid" using stset
>
> Elizabeth <ebrown@prgs.edu> :
> You are perhaps thinking of a cross-sectional regression later in the
> survey, where you might want to adjust for attrition.  But if you are
> using obs from 1968 on, you do not want to use the later weights...
> you want to use the weights that are the inverse probability of
> selection into the sample in the first place.  But things get very
> complicated in the PSID, so if you want to be careful, you will have to
> make your own weights to adjust for your particular sample selection
> rules to make your sample representative of some larger population--but
> first define that population, as I said before.  All person-years in the
> US 1968 to the present?  Not possible.  All those who were eligible or
> had an ancestor eligible in 1968? Etc.
>
> <snip>
> ====================
>
> I'd like to second Austin's wise advice to consider seriously the
> population that you are trying to represent with your sample.
>
> Nonetheless there are complications (as you realise). The "weights"
> variables that are provided in most household panel surveys (indeed in
> most household surveys) are general purpose weights, and may not be
> relevant to your analysis, at least on a strict interpretation. Most
> people simply use the weights provided, however, largely because they are
> there, I suspect. Also, few want to go down the route of creating their
> own or, alternatively, jointly modeling the response process along with
> the outcome process.
>
> I disagree with Austin about which weights to use for panel analysis. If
> you are going to use the survey weights provided (subject to the last
> paragraph's caveats), then I would use those for the /last/ wave
> observed and not the first. Reason: in virtually all household panels I
> am aware of, the longitudinal weights provided reflect not only design
> factors (inverse probability of selection into the first wave), but also
> correction for subsequent sample drop-out (attrition). I think the
> "weights" in the US Panel Study of Income Dynamics are also longitudinal
> weights of this kind. (Martha Hill's otherwise very useful introduction
> to the PSID, published by Sage, isn't clear on this; my interpretation
> comes from discussion with PSID staff several years ago.)


