Re: st: panel data sets based on complex survey design
I have the same issue, and am using the Consumer Expenditure Survey put
out by the US Bureau of Labor Statistics. They have a whole series
of weights for estimating means, and "half sample weights" for
constructing "balanced" replicates to do sort of a bootstrap for
estimating variances of estimators.
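As I understand the replicate idea, it can be sketched roughly as follows: recompute the estimate once per set of replicate weights, then average the squared deviations from the full-sample estimate. This is only a minimal illustration in Python under my own assumptions; the function names and toy numbers are hypothetical, and it is not the BLS procedure or its variable names.

```python
# Sketch of a replicate-weight (BRR-style) variance for a weighted mean.
# All names and data here are hypothetical illustrations.

def weighted_mean(values, weights):
    """Weighted mean of values; the weights must not sum to zero."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

def brr_variance(values, full_weights, replicate_weights):
    """Mean squared deviation of the replicate estimates from the full-sample estimate."""
    theta_full = weighted_mean(values, full_weights)
    theta_reps = [weighted_mean(values, w) for w in replicate_weights]
    return sum((t - theta_full) ** 2 for t in theta_reps) / len(theta_reps)

# Toy data: four units and two half-sample replicates (weights doubled or zeroed).
values = [10.0, 20.0, 30.0, 40.0]
full_weights = [1.0, 1.0, 1.0, 1.0]
replicate_weights = [[2.0, 0.0, 2.0, 0.0], [0.0, 2.0, 0.0, 2.0]]
print(brr_variance(values, full_weights, replicate_weights))
```

The same loop would apply to any estimator, not just a mean: swap weighted_mean for whatever statistic is re-estimated under each replicate weight.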
But I'm not sure how to best combine this with a method that uses
individual fixed effects. I suppose the issue, since this is not a
balanced panel, is whether to count someone who appears in the
survey for 4 periods more or less than someone who appears in the
survey for 3 periods. It seems to me a fair way would be to use a
person's weight in the first period they appear, as suggested below
I think. Probably doesn't matter much for the results.
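To make the first-period idea concrete, here is a small Python sketch that picks, for each person, the weight from the earliest period in which they appear, so the weight stays constant within person across an unbalanced panel. The record layout (person id, period, weight) and the function name are assumptions of mine, not anything from the survey files.

```python
# Sketch: assign each person the weight from their first observed period.
# The (person_id, period, weight) layout is a hypothetical illustration.

def first_period_weights(records):
    """Return {person_id: weight in that person's earliest period}."""
    earliest = {}
    for person_id, period, weight in records:
        # Keep the record with the smallest period seen so far for this person.
        if person_id not in earliest or period < earliest[person_id][0]:
            earliest[person_id] = (period, weight)
    return {person_id: weight for person_id, (_, weight) in earliest.items()}

# Toy unbalanced panel: person "a" appears in periods 1-2, person "b" in 3-4.
records = [("a", 2, 1.5), ("a", 1, 1.0), ("b", 3, 2.0), ("b", 4, 2.5)]
print(first_period_weights(records))
```

Each person then carries a single weight into the fixed-effects estimation, regardless of whether they appear for 3 or 4 periods.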
I am also confused about how to estimate the variance of a variable
itself, rather than the variance of an estimate. Any references to
recommend on this?
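To be clear about the distinction I mean: the variance of the variable is a spread in the population, which a weighted sample can estimate directly, whereas the variance of an estimate is sampling variability. One simple plug-in version of the former, as a Python sketch with hypothetical names, and not necessarily what a survey statistician would recommend:

```python
# Sketch: weighted plug-in estimate of a variable's population variance.
# This estimates the spread of the variable itself, not the sampling
# variance of an estimator. Names and data are hypothetical.

def weighted_variance(values, weights):
    """Sum of w_i * (x_i - weighted mean)^2, divided by the sum of the weights."""
    total_weight = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_weight
    return sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total_weight

# Toy data: two units with equal weights.
print(weighted_variance([1.0, 3.0], [1.0, 1.0]))
```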
Let's keep in touch. Maybe share techniques and even data.
--- In email@example.com, Jenkins S P <stephenj@e...> wrote:
> On Tue, 25 May 2004 michael.alexander@i... wrote:
> > Users, I posted this query a couple of weeks ago but it didn't seem to spark
> > any responses. I'm posting it again in the hope someone has some
> > the matter.
> The Statalist FAQ provides advice about why posts may not be
> whether to re-post to all, without re-writing or reconsideration.
> > I am about to undertake analysis on the Household, Income and
> > Dynamics in Australia (HILDA) survey using Stata version 8 (which I have
> > recently obtained access to). HILDA is a panel data set of
> > households and individuals. I was hoping to find that the latest
> > Stata had the capability to deal with data that was both
> > nature and of a complex survey design (I guess a combination of the xt and
> > svy commands). However, my initial scan of the guides doesn't
> > anything that can deal with both these issues simultaneously.
> > How do other users analysing HILDA (and other longitudinal
> > with this issue of longitudinal data when there are significant
> > stratification and clustering issues? Any thoughts greatly
> There probably aren't commands routinely available because it is
> that one should account for complex survey design in a panel in
> manner. The approach in effect assumes that design effects can be
> using appropriate weights (and accounting for the clustering and
> stratification). But where do the weights come from? Usually
> all-purpose weights (and may not even be special longitudinal
> and derived, broadly speaking, from regressions of the probability
> retention with loads of RHS variables.
> Economists and others often approach this differently (call it a
> 'modelling' approach rather than a 'weighting' approach): they
> retention probability jointly with the process of interest. The key
> difference from the weighting approach is that one allows for
> between the unobservable factors determining retention and process
> interest. (Of course there are arguments about identification to
> resolved as well.) See Journal of Human Resources 1998 special
> attrition etc in longitudinal surveys, and references therein.
> If you are simply after crosstabs and other descriptives from
> data with an originally complex design, then it is more common to
> weighting approach, and these commands are available in Stata of
> (-svytab- etc.). Again there is the issue of the weights (which
> Which variables get used in the -svyset-ting of the clustering and
> stratification when you have multiple waves? Not totally clear,
> survey statistician colleague of mine once recommended that you
> cluster/psu and strata from wave 1 of the survey.
> [NB1 issues get much more complicated when one e.g. pools annual
> transitions from successive years of a panel survey. It is not
> weights one should use in this case -- the longitudinal weight
> second year in each case?]
> [NB2 Working out which type of weights to use is a tricky business.
> Different household panels provide different sorts of weights,
> cross-sectional and longitudinal, and for enumerated individuals,
> respondents, and households. The PSID does not distinguish between
> cross-sectional and longitudinal weights, whereas the BHPS and the
> do -- though the last two provide longitudinal weights in
> I don't know what sort of weights HILDA use.]
> Stephen (from the home of the BHPS)
> Professor Stephen P. Jenkins <stephenj@e...>
> Institute for Social and Economic Research (ISER)
> University of Essex, Colchester CO4 3SQ, UK
> Phone: +44 1206 873374. Fax: +44 1206 873151.
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/