Re: st: sampling weight in --sum,--fsum

From   Steven Samuels <[email protected]>
To   [email protected]
Subject   Re: st: sampling weight in --sum,--fsum
Date   Tue, 9 Sep 2008 12:22:26 -0400

Mandy, to answer your questions, you need to distinguish two kinds of description:

a. Of the sample itself
b. Of the population that the sample is supposed to represent.

a. For describing the sample itself, use basic Stata commands like -sum- and -tab- without weights. This is important: readers will want to know the composition of the sample, for example, whether the sample sizes in important subgroups were big enough to give reliable results.

b. For describing the population, the answer depends on the population. For many populations, you would not do "basic description". For example, if the population is the US or a state, most of its characteristics are known from other sources. The survey weights may even have been adjusted to reproduce what is known about the population, such as its age and gender distribution. On the other hand, if the population is unique and not known from other sources, then description might be the point of the survey. For example, you might have a sample of a district in a city. To describe the population you would use -svy- commands, not ordinary Stata commands, as Joao stated in his email.
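The two kinds of description in (a) and (b) might look like this in practice. This is only a sketch: the variable names income, region and wt (the sampling weight) are hypothetical and do not come from Mandy's data.

```stata
* Hypothetical variables: income, region, wt (sampling weight)

* (a) Describe the sample itself: ordinary commands, no weights
summarize income
tabulate region

* (b) Describe the population: declare the weight, then use -svy-
svyset [pweight = wt]
svy: mean income
svy: tabulate region
```

With only a weight declared, -svyset- treats each observation as its own sampling unit, so the standard errors reported in (b) do not reflect any clustering or stratification in the actual design.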

To expand on Joao's answer to your second question: with weights, but without the cluster and stratum identifiers, regressions with the -svy- prefix can give good point estimates, but tests, p-values, standard errors and confidence intervals will all be wrong. Usually the reported SEs will be smaller than the actual SEs, and the reported p-values will be "too significant".
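To make the contrast concrete, here is a sketch of the two declarations. All variable names (income, educ, wt, psu_id, stratum_id) are hypothetical; the second declaration assumes the cluster and stratum identifiers could be obtained from the survey documentation or the data provider.

```stata
* Hypothetical variables: income, educ, wt, psu_id, stratum_id

* Weights only: coefficients are usable, but the SEs ignore
* clustering and stratification
svyset [pweight = wt]
svy: regress income educ

* Full design, if the identifiers become available
svyset psu_id [pweight = wt], strata(stratum_id)
svy: regress income educ
```

The coefficients are the same in both runs, because the point estimates depend only on the weights. Only the standard errors, tests and confidence intervals change.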


On Sep 9, 2008, at 10:39 AM, Man Jia wrote:

Hi everyone,

I have a simple question about survey data. I'd like to report
basic descriptive statistics for a survey data set. But commands
like --sum and --fsum don't accept pweight (probability weight).
Only fweight or aweight can be used.

Could anyone help me with the following questions?

1) How should I get the basic statistics for survey data where the
sampling weight is known?

2) I have no information on strata or clusters. All I know is
the sampling weight assigned to each individual. If I use only
pweight to run OLS regressions with the svy command, should the
missing strata and cluster information be a concern?

Thanks for your help!

