

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: weights for a longitudinal set (Was: probable error, "weights invalid" using stset]
Date: Thu, 3 Nov 2011 12:39:01 -0400

Stephen <S.Jenkins@lse.ac.uk>:
Consider a simple discrete-time hazard model using a logit of y on dummies for time and X. We have weights in each of the time periods 1968, 1969, ... 1993, and people drop out randomly conditional on time and X in every year 1969, ... 1993. Would you want to weight each observation based on their weight in the last year observed, or the 1968 weight? I contend you should use the 1968 weight, as that corresponds to the maximum likelihood estimator you are mimicking with your logit. The population here is the 1968 population eligible for inclusion in the sample, and anyone without a 1968 weight is excluded.

For any other kind of analysis, you have to construct your own weight, but using the last observed weight will never be correct, as far as I can see, unless everyone appears in the same period at last observation (stock sampling with no follow-up, in which case there is no attrition correction in the weights, only nonresponse and raking). With stock sampling with no follow-up, you would want the first actual sample weights. For sampling with refreshment, things are more complicated, and there is no clear answer, but the first observed weight seems to me to be the closest to correct of any easy rule of thumb.

On Thu, Nov 3, 2011 at 5:55 AM, <S.Jenkins@lse.ac.uk> wrote:
> ------------------------------
>
> Date: Wed, 2 Nov 2011 16:10:27 +0000
> From: "Brown, Elizabeth" <ebrown@prgs.edu>
> Subject: RE: st: probable error, "weights invalid" using stset
>
> Oh, yes. That makes sense re: inverse probability of selection. Of
> course. Thank you.
>
> - -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Austin
> Nichols
> Sent: Tuesday, November 01, 2011 12:37 PM
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: probable error, "weights invalid" using stset
>
> Elizabeth <ebrown@prgs.edu> :
> You are perhaps thinking of a cross-sectional regression later in the
> survey, where you might want to adjust for attrition. But if you are
> using obs from 1968 on, you do not want to use the later weights...
> you want to use the weights that are the inverse probability of
> selection into the sample in the first place. But things get very
> complicated in the PSID, so if you want to be careful, you will have to
> make your own weights to adjust for your particular sample selection
> rules to make your sample representative of some larger population--but
> first define that population, as I said before. All person-years in the
> US 1968 to the present? Not possible. All those who were eligible or
> had an ancestor eligible in 1968? Etc.
>
> <snip>
> ====================
>
> I'd like to second Austin's wise advice to consider seriously the
> population that you are trying to represent with your sample.
>
> Nonetheless there are complications (as you realise). The "weights"
> variables that are provided in most household panel surveys (indeed in
> most household surveys) are general purpose weights, and may not be
> relevant to your analysis, at least on a strict interpretation. Most
> people simply use the weights provided however; largely because they are
> there, I suspect. Also, few want to go down the route of creating their
> own or, alternatively, jointly modeling the response process along with
> the outcome process.
>
> I disagree with Austin about which weights to use for panel analysis. If
> you are going to use the survey weights provided (subject to the last
> paragraph's caveats), then I would use those for the /last/ wave
> observed and not the first. Reason: in virtually all household panels I
> am aware of, the longitudinal weights provided reflect not only design
> factors (inverse probability of selection into the first wave), but also
> correction for subsequent sample drop-out (attrition). I think the
> "weights" in the US Panel Study of Income Dynamics are also longitudinal
> weights of this kind. (Martha Hill's otherwise very useful introduction
> to the PSID, published by Sage, isn't clear on this; my interpretation
> comes from discussion with PSID staff several years ago.)

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
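The two rules debated in this thread (the 1968 base-year weight versus the weight from the last wave observed) differ only in which wave's weight is carried across a person's records. The sketch below illustrates that difference on a person-year file. It is only an illustration under stated assumptions: the records, the `fixed_weights` and `attach` helpers, and every number are hypothetical and are not PSID data or anything posted by the authors.

```python
# Hypothetical person-year records: (person id, year, weight reported that year).
# All values are invented for illustration; they are not PSID data.
records = [
    (1, 1968, 10.0), (1, 1969, 12.0), (1, 1970, 15.0),
    (2, 1968, 20.0), (2, 1969, 26.0),          # person 2 drops out after 1969
    (3, 1969, 8.0), (3, 1970, 9.0),            # person 3 has no 1968 weight
]

BASE_YEAR = 1968  # assumed first wave, as in the example discussed above


def fixed_weights(rows, rule):
    """Return {person: weight}, one time-constant weight per person.

    rule="base": the BASE_YEAR weight; people without one are excluded.
    rule="last": the weight from the last wave in which the person appears.
    """
    by_person = {}
    for pid, year, wt in rows:
        by_person.setdefault(pid, {})[year] = wt

    weights = {}
    for pid, waves in by_person.items():
        if rule == "base":
            if BASE_YEAR in waves:
                weights[pid] = waves[BASE_YEAR]
        elif rule == "last":
            weights[pid] = waves[max(waves)]
        else:
            raise ValueError("rule must be 'base' or 'last'")
    return weights


def attach(rows, weights):
    """Keep person-years of weighted people and attach each person's fixed weight."""
    return [(pid, year, weights[pid]) for pid, year, _ in rows if pid in weights]


base_weights = fixed_weights(records, "base")
last_weights = fixed_weights(records, "last")
base_rows = attach(records, base_weights)   # analysis file under the base-year rule
last_rows = attach(records, last_weights)   # analysis file under the last-wave rule
```

Under the base-year rule, person 3 leaves the analysis file because the target population is defined as those eligible in 1968. Under the last-wave rule, person 3 stays and every person carries a weight that already includes an attrition adjustment. Either constant weight would then be supplied as a probability weight to the person-year logit, for example in Stata as `[pweight=w]` with `vce(cluster id)`, where `w` and `id` are hypothetical variable names.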

**References**:
- **st: weights for a longitudinal set (Was: probable error, "weights invalid" using stset]** *From:* <S.Jenkins@lse.ac.uk>
