
Re: st: panel data sets based on complex survey design


From   "daaronR" <drein@uclink.berkeley.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: panel data sets based on complex survey design
Date   Fri, 28 May 2004 00:12:15 -0000

I have the same issue, and am using the Consumer Expenditure Survey put
out by the US Bureau of Labor Statistics.  They have a whole series
of weights for estimating means, and "half sample weights" for
constructing "balanced" replicates to do sort of a bootstrap for
estimating variances of estimators.
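
For concreteness, here is a minimal Python sketch of how replicate
weights of this kind are typically used to get the variance of a
weighted mean. It assumes the standard balanced repeated replication
formula (the average squared deviation of the replicate estimates from
the full-sample estimate). The function names and the toy numbers are
made up for illustration, and the survey's own documentation should be
checked for the exact formula it intends.

```python
def weighted_mean(values, weights):
    """Weighted mean of values using the given survey weights."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def brr_variance(values, full_weights, replicate_weights):
    """BRR variance of the weighted mean (assumed formula, no Fay adjustment).

    replicate_weights is a list of weight vectors, one per half-sample
    replicate. Each replicate estimate is compared with the full-sample
    estimate and the squared deviations are averaged.
    """
    full_estimate = weighted_mean(values, full_weights)
    replicate_estimates = [weighted_mean(values, rw) for rw in replicate_weights]
    squared_devs = [(est - full_estimate) ** 2 for est in replicate_estimates]
    return sum(squared_devs) / len(replicate_estimates)


# Hypothetical toy data: four respondents, four half-sample replicates.
# In each replicate, half the sample gets double weight and half gets zero.
values = [10.0, 20.0, 30.0, 40.0]
full_weights = [1.0, 1.0, 1.0, 1.0]
replicate_weights = [
    [2.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 2.0],
    [2.0, 0.0, 0.0, 2.0],
    [0.0, 2.0, 2.0, 0.0],
]

variance_of_mean = brr_variance(values, full_weights, replicate_weights)
print(variance_of_mean)  # 12.5
```

Note that this gives the variance of the estimated mean, not the
variance of the variable itself.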

But I'm not sure how to best combine this with a method that uses
individual fixed effects.  I suppose the issue, since this is not a
balanced panel, is whether to count someone who appears in the
survey for 4 periods more or less than someone who appears in the
survey for 3 periods.   It seems to me a fair way would be to use a
person's weight in the first period they appear, as suggested below
I think.  Probably doesn't matter much for the results.
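
A rough Python sketch of what I mean by using the first-period weight
in an unbalanced panel. It assumes the data are long-format records of
(person id, period, weight); those field names and the toy records are
hypothetical, and this is only one possible convention, not an
established rule.

```python
def first_period_weights(records):
    """Map each person to the weight from the earliest period they appear in.

    records is an iterable of (person_id, period, weight) tuples and may
    be in any order; people may appear in different numbers of periods.
    """
    first_seen = {}
    for person_id, period, weight in records:
        # Keep the record with the smallest period for each person.
        if person_id not in first_seen or period < first_seen[person_id][0]:
            first_seen[person_id] = (period, weight)
    return {pid: weight for pid, (_, weight) in first_seen.items()}


# Hypothetical toy panel: person "a" is observed in periods 1-2,
# person "b" in periods 3-4.
panel = [
    ("a", 2, 1.5),
    ("a", 1, 2.0),
    ("b", 3, 4.0),
    ("b", 4, 5.0),
]

fixed_weights = first_period_weights(panel)
```

Each person then carries one constant weight across all of their
periods, which sits more comfortably with a fixed-effects setup than a
weight that changes from period to period.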

I am also confused about how to estimate the variance of a variable
itself, rather than the variance of an estimate.  Any references to
recommend on this?

Let's keep in touch.  Maybe share techniques and even data.
David

--- In statalist@yahoogroups.com, Jenkins S P <stephenj@e...> wrote:
> On Tue, 25 May 2004 michael.alexander@i... wrote:
>
> > Users, I posted this query a couple of weeks ago but it didn't seem to spark
> > any responses. I'm posting it again in the hope someone has some thoughts on
> > the matter.
>
> The Statalist FAQ provides advice about why posts may not be answered and
> whether to re-post to all, without re-writing or reconsideration.
>
> > I am about to undertake analysis on the Household, Income and Labour
> > Dynamics in Australia (HILDA) survey using Stata version 8 (which I have
> > recently obtained access to). HILDA is a panel data set of Australian
> > households and individuals. I was hoping to find that the latest version of
> > Stata had the capability to deal with data that was both longitudinal in
> > nature and of a complex survey design (I guess a combination of the xt and
> > svy commands). However, my initial scan of the guides doesn't reveal
> > anything that can deal with both these issues simultaneously.
> >
> > How do other users analysing HILDA (and other longitudinal surveys) deal
> > with this issue of longitudinal data when there are significant
> > stratification and clustering issues? Any thoughts greatly appreciated.
>
> There probably aren't commands routinely available because it is not clear
> that one should account for complex survey design in a panel in this
> manner. The approach in effect assumes that design effects can be dealt
> with using appropriate weights (and accounting for the clustering and
> stratification).  But where do the weights come from?  Usually they are
> all-purpose weights (and may not even be special longitudinal weights),
> and derived, broadly speaking, from regressions of the probability of
> retention with loads of RHS variables.
>
> Economists and others often approach this differently (call it a
> 'modelling' approach rather than a 'weighting' approach): they model the
> retention probability jointly with the process of interest. The key
> difference from the weighting approach is that one allows for correlations
> between the unobservable factors determining retention and the process of
> interest.  (Of course there are arguments about identification to be
> resolved as well.) See Journal of Human Resources 1998 special issue on
> attrition etc. in longitudinal surveys, and references therein.
>
> If you are simply after crosstabs and other descriptives from longitudinal
> data with an originally complex design, then it is more common to take a
> weighting approach, and these commands are available in Stata of course
> (-svytab- etc.).  Again there is the issue of the weights (which ones).
> Which variables get used in the -svyset-ting of the clustering and
> stratification when you have multiple waves?  Not totally clear, but a
> survey statistician colleague of mine once recommended that you use the
> cluster/psu and strata from wave 1 of the survey.
>
> [NB1 issues get much more complicated when one e.g. pools annual
> transitions from successive years of a panel survey. It is not clear what
> weights one should use in this case -- the longitudinal weight from the
> second year in each case?]
>
> [NB2 Working out which type of weights to use is a tricky business.
> Different household panels provide different sorts of weights,
> cross-sectional and longitudinal, and for enumerated individuals, adult
> respondents, and households.  The PSID does not distinguish between
> cross-sectional and longitudinal weights, whereas the BHPS and the GSOEP
> do -- though the last two provide longitudinal weights in different ways.
> I don't know what sort of weights HILDA use.]
>
> Stephen (from the home of the BHPS)
> =============================================
> Professor Stephen P. Jenkins <stephenj@e...>
> Institute for Social and Economic Research (ISER)
> University of Essex, Colchester CO4 3SQ, UK
> Phone: +44 1206 873374.  Fax: +44 1206 873151.
> http://www.iser.essex.ac.uk
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
