# st: RE: Re: RE: Re: RE: statistical significance in a data set with weighted observations

 From "Sayer, Bryan" <[email protected]> To "'[email protected] '" <[email protected]> Subject st: RE: Re: RE: Re: RE: statistical significance in a data set with weighted observations Date Fri, 23 May 2003 17:47:47 -0400
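
The thread below turns on the difference between treating weights as frequency weights and treating them as sampling weights. Here is a minimal sketch of that difference, in Python rather than Stata. The data, the names `x`, `y`, `w`, and the helper `weighted_slope` are all hypothetical, and the `n/(n-2)` factor in the sandwich formula is an assumption of this sketch, not a statement about what any particular Stata command computes.

```python
import math


def weighted_slope(x, y, w):
    """Weighted least-squares fit of y = a + b*x.

    Returns (b, se_fweight, se_robust) for the slope b.
    """
    n = len(x)        # rows actually observed
    big_n = sum(w)    # sum of the weights
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / big_n
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / big_n
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    # Frequency-weight reading: row i stands for w[i] identical observations,
    # so the residual degrees of freedom are sum(w) - 2.
    s2 = sum(wi * ei ** 2 for wi, ei in zip(w, resid)) / (big_n - 2)
    se_fweight = math.sqrt(s2 / sxx)

    # Sampling-weight reading: sandwich (robust) variance based on the n rows
    # actually observed, with an assumed n/(n-2) small-sample factor.
    meat = sum((wi * (xi - xbar) * ei) ** 2 for wi, xi, ei in zip(w, x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return b, se_fweight, se_robust


# Hypothetical data with weights ranging from 1 to 500.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.2, 4.8, 4.1, 6.3, 5.9, 8.4]
w = [1, 500, 3, 40, 250, 2, 120, 15]

b, se_f, se_r = weighted_slope(x, y, w)
b10, se_f10, se_r10 = weighted_slope(x, y, [10 * wi for wi in w])

print("slope:", b, "t (fweight):", b / se_f, "t (robust):", b / se_r)
print("weights x10 -> t (fweight):", b10 / se_f10, "t (robust):", b10 / se_r10)
```

Multiplying every weight by 10 leaves the slope and the sandwich standard error unchanged but shrinks the frequency-weight standard error. That is the pattern to expect when weights that are really sampling weights are declared as frequency weights.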

```
If they are truly frequency weights (i.e. your data points correspond to
cell counts in a contingency table) then you need to use procedures that
accept fweight as an option, or else you need to expand your observations by
fweight.  Your denominator is the sum of the weights, in this case.

Bryan Sayer
Statistician, SSS Inc.

-----Original Message-----
From: [email protected]
To: [email protected]
Sent: 5/23/03 4:22 PM
Subject: st: Re: RE: Re: RE: statistical significance in a data set with
weighted observations

I apologize if my question was confusing. I know that the weights in my
sample are frequency weights. The problem is not in accounting for weights
in the regression but in the statistical significance of the coefficients.
I remember from literature that with weighted data one must be careful with
the interpretation of statistical significance, as t-statistics tend to be
overstated. I am curious if anyone knows how to account for this
statistically.

MM

----- Original Message -----
From: "Copeland, Laurel" <[email protected]>
To: <[email protected]>
Sent: Friday, May 23, 2003 4:09 PM
Subject: st: RE: Re: RE: statistical significance in a data set with
weighted observations

> As I understand things, the t-statistics for the parameter estimates
> correctly reflect the importance of your predictors in your analysis,
> assuming the sample was taken as represented by the weights. The effect
> is taken into account if you use -svy...- specifying weights (psu,
> strata).
>
> If you do not include the weights (so analyze the small sample as if it
> were an entity unto itself), you will not get correct parameter estimates
> (or accompanying t-statistics) to generalize.
>
> The fact that the t-statistics are significant or insignificant is
> immaterial.  Your approach need only be consistent with what the data
> actually represent.
>
> You mention fweights (frequency weights). I am assuming these are
> 1/pweight (probability weights) for your dataset.  If this is not the
> case, you may need to find out more about your sample and how it was
> taken.
>
> Actually, you should find out as much as you can about your sample and
> how it was taken, regardless.
>
> -Laurel
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Friday, May 23, 2003 3:32 PM
> To: [email protected]
> Subject: st: Re: RE: statistical significance in a data set with
> weighted observations
>
> Thank you!
>
> I do account for the weights using fweights in my regression, but the
> weights increase the impact of each observation, thereby inflating the
> t-statistics so that all explanatory variables appear significant. Is
> there a way of accounting for that effect on t-stats?
>
> thanks,
>
> Mikhail
> ----- Original Message -----
> From: "Copeland, Laurel" <[email protected]>
> To: <[email protected]>
> Sent: Friday, May 23, 2003 3:05 PM
> Subject: st: RE: statistical significance in a data set with weighted
> observations
>
>
> > The data can be weighted to reflect the sampling design.  The sampling
> > design is complex to give you a sample that is representative of the
> > underlying population, and to allow inferential statistics.  The
> > complex sampling lets you get a good sample of a large population of
> > unlisted smaller units (e.g., all US residents), based on a complete
> > list of larger units (e.g., US census tracts).  The weight is the
> > inverse of the probability of getting sampled.  In your sample,
> > individual units have differing probabilities of being sampled, so
> > they have differing weights. The calculated size of the population
> > that is represented by your sample will be produced by Stata -svy-
> > commands. To analyze such a sample properly, you must include the PSU,
> > strata, and weights in your analysis, if they exist. Without the
> > weights, the estimates you get will be biased. Sometimes weights are
> > used to allow post-stratification (for matching to a known
> > distribution) or to deal with non-response.
> > -Laurel
> >
> > -----Original Message-----
> > From: [email protected] [mailto:[email protected]]
> > Sent: Friday, May 23, 2003 2:52 PM
> > To: [email protected]
> > Subject: st: statistical significance in a data set with weighted
> > observations
> >
> > Dear Stata Users,
> >
> > I have encountered this small problem and since I am not sure about
> > how to address it myself I've decided to ask you all. Thank you in
> > advance for any advice you might have for me.
> >
> > I am working with a dataset that has weights for all observations, and
> > these weights exhibit large variation, from 1 to over 500. When I run
> > a nonweighted estimation my t-statistics are relatively small, but
> > when weights are introduced, the t-statistics jump. Is there a way of
> > determining the true statistical significance of coefficients in this
> > case?
> >
> > Thanks again for any help you might have,
> >
> > MM
> >
>
