# st: RE: Re: RE: statistical significance in a data set with weighted observations

From: "Copeland, Laurel" <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: Re: RE: statistical significance in a data set with weighted observations
Date: Fri, 23 May 2003 13:09:15 -0700

As I understand things, the t-statistics for the parameter estimates are
computed assuming the sample was taken as represented by the weights. The
effect is taken into account if you use the -svy...- commands, specifying
the weights, PSU, and strata.

If you do not include the weights (and so analyze the small sample as if it
were an entity unto itself), you will not get parameter estimates (or
accompanying t-statistics) that generalize to the population.

The fact that the t-statistics are significant or insignificant is
immaterial.  Your approach need only be consistent with what the data
actually represent.

You mention fweights (frequency weights). I am assuming these are really
probability weights (pweights, the inverse of each observation's probability
of being sampled) for your dataset.  If this is not the case, you may
need to find out more about your sample and how it was taken.

Actually, you should find out as much as you can about your sample and how
it was taken, regardless.

-Laurel
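
One way to see why frequency weights push the t-statistics up: with fweights each row is treated as that many identical observations, so the assumed sample size is the sum of the weights, whereas sampling weights keep the sample size at the number of rows actually observed. The following is a minimal sketch in plain Python (not Stata) for a one-regressor model. The function names and the toy data are made up for illustration, and the sampling-weight standard error uses a sandwich formula with an n/(n-2) correction as an assumption about how such estimators typically work, not a reproduction of any Stata routine.

```python
import math


def weighted_fit(x, y, w):
    """Weighted least squares for y = a + b*x.

    Returns the slope b, the residuals, the weighted sum of squares of x
    around its weighted mean (sxx), and that weighted mean (xbar).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return b, resid, sxx, xbar


def slope_se_frequency(x, y, w):
    """Slope and its SE when each row counts as w identical rows (N = sum of w)."""
    b, resid, sxx, _ = weighted_fit(x, y, w)
    n_assumed = sum(w)
    s2 = sum(wi * ei ** 2 for wi, ei in zip(w, resid)) / (n_assumed - 2)
    return b, math.sqrt(s2 / sxx)


def slope_se_sampling(x, y, w):
    """Slope and a sandwich-type SE; n is the number of rows actually observed."""
    b, resid, sxx, xbar = weighted_fit(x, y, w)
    n = len(x)
    meat = sum((wi * (xi - xbar) * ei) ** 2 for wi, xi, ei in zip(w, x, resid))
    return b, math.sqrt(n / (n - 2) * meat) / sxx


# Hypothetical toy data: six observed rows with very unequal weights.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 5.7]
w = [1, 5, 20, 100, 300, 500]

for scale in (1, 10):
    ws = [scale * wi for wi in w]
    b, se_f = slope_se_frequency(x, y, ws)
    _, se_s = slope_se_sampling(x, y, ws)
    print(f"scale={scale}: b={b:.4f}  "
          f"t(frequency)={b / se_f:.1f}  t(sampling)={b / se_s:.1f}")
```

Multiplying all weights by 10 leaves the slope and the sampling-weight standard error unchanged but shrinks the frequency-weight standard error. That is the same mechanism that can make every coefficient look significant when the weights run from 1 to over 500.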

-----Original Message-----
From: [email protected] [mailto:[email protected]]
Sent: Friday, May 23, 2003 3:32 PM
To: [email protected]
Subject: st: Re: RE: statistical significance in a data set with
weighted observations

Thank you!

I do account for the weights using fweights in my regression, but the
weights increase the impact of the observations and thereby inflate the
t-statistics, so that all explanatory variables appear significant. Is
there a way of accounting for that effect on the t-stats?

thanks,

Mikhail
----- Original Message -----
From: "Copeland, Laurel" <[email protected]>
To: <[email protected]>
Sent: Friday, May 23, 2003 3:05 PM
Subject: st: RE: statistical significance in a data set with weighted
observations

> The data can be weighted to reflect the sampling design.  The sampling
> design is complex to give you a sample that is representative of the
> underlying population, and to allow inferential statistics.  The complex
> sampling lets you get a good sample of a large population of unlisted
> smaller units (e.g., all US residents), based on a complete list of larger
> units (e.g., US census tracts).  Units differ in their probability of
> being sampled, and the weight is the inverse of that probability, so
> observations have differing weights.
> The calculated size of the population that is represented by your sample
> will be produced by Stata -svy- commands. To analyze such a sample
> properly, you must include the PSU, strata, and weights in your analysis,
> if they exist. Without the weights, the estimates you get will be biased.
> Sometimes weights are used to allow post-stratification (for matching to a
> known distribution) or to deal with non-response.
> -Laurel
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Friday, May 23, 2003 2:52 PM
> To: [email protected]
> Subject: st: statistical significance in a data set with weighted
> observations
>
> Dear Stata Users,
>
> I have encountered this small problem, and since I am not sure how to
> handle it, I would appreciate any advice you might have for me.
>
> I am working with a dataset that has weights for all observations, and
> these weights exhibit large variation, from 1 to over 500. When I run a
> nonweighted estimation my t-statistics are relatively small, but when
> weights are introduced, the t-statistics jump. Is there a way of
> determining the true statistical significance of coefficients in this
> case?
>
>
> MM
>
