
RE: st: Analyzing a subpopulation in Stata 10.1

From	"Karadogan, Figen" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	RE: st: Analyzing a subpopulation in Stata 10.1
Date	Tue, 30 Jun 2009 15:49:41 +0000

Michael and Jeff,
Thank you so much for all your help regarding my question.  Your comments and responses were really helpful.

Figen Karadogan

________________________________________
From: [email protected] [[email protected]] on behalf of Jeff Pitblado, StataCorp LP [[email protected]]
Sent: Monday, June 29, 2009 10:51 PM
To: [email protected]
Subject: Re: st: Analyzing a subpopulation in Stata 10.1

"Michael I. Lichter" <[email protected]> had some follow-up comments and
questions about poststratification adjustments, missing values, and
subpopulation estimation:

> Thank you so much for your detailed response. It was very helpful in
> demonstrating the mechanics of Stata's poststratification adjustment to
> weights. I now understand how they work, but I'm not sure that what they
> do makes sense for subpopulations.

> I can look at Table T4 to get estimated #s of women who have given
> birth, who have not given birth, and who we don't know about, all adding
> up to the subpopulation size. I can create a new weight (with some
> work), still based on the poststrata, that produces estimated counts of
> the women who have given birth and those who have not, which also add up
> to the subpopulation size.

> The counts in Table T3, however, don't add up to the subpopulation size
> and don't have a straightforward interpretation. Since they total to
> 2251, the table implies that 2374-2251 = 123 women have unknown status.
> It's unclear why that's a better number than the 260 estimated in T4. To
> me, the numbers in T3 have no substantive meaning ... and by extension
> proportions, regressions, etc., will be weighted in a manner that has no
> obvious interpretation.

> It seems to me that the right thing to do is either drop the missing
> data, like we do in T4 or ordinarily would if we were not using
> poststratification, or to produce estimates that sum to subpopulation
> totals through reweighting at the subpopulation level. Can you tell me
> why I'm wrong? Thanks.

The thing to keep in mind here is that the poststratification adjustment must
be applied to the entire estimation sample.

It is not possible to reweight at the subpopulation level unless there is
poststratification information at that level; that is, unless we had the
poststratum population sizes for the four cells defined by sex and native
status.
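
As a rough illustration of why the adjustment is tied to the whole estimation
sample, here is a minimal Python sketch of the usual poststratification ratio
adjustment: each sampling weight is scaled by the poststratum population size
divided by the sum of the sampling weights in that poststratum. The data, the
poststratum labels, and the function name `poststratify` are made up for
illustration; this is not Stata's internal code.

```python
from collections import defaultdict

def poststratify(weights, post_ids, post_sizes):
    """Scale each sampling weight by N_k / (sum of weights in poststratum k)."""
    totals = defaultdict(float)
    for w, k in zip(weights, post_ids):
        totals[k] += w
    return [w * post_sizes[k] / totals[k] for w, k in zip(weights, post_ids)]

# Hypothetical sample: poststrata are defined by sex only.
weights    = [10.0, 10.0, 20.0, 20.0, 15.0, 15.0]
post_ids   = ["F", "F", "F", "M", "M", "M"]
native     = [1, 0, 1, 1, 0, 0]          # subpopulation indicator
post_sizes = {"F": 60.0, "M": 40.0}      # known poststratum population sizes

adj = poststratify(weights, post_ids, post_sizes)

# The adjusted weights reproduce the known size of each poststratum ...
female_total = sum(a for a, k in zip(adj, post_ids) if k == "F")

# ... but the subpopulation total is only an estimate, because no known
# population size for "native" entered the adjustment.
native_total = sum(a for a, n in zip(adj, native) if n == 1)

print(female_total, native_total, sum(adj))
```

In this toy sample the adjusted weights for women sum to the known 60, while
the total for the native subpopulation is simply whatever the adjusted weights
happen to add up to.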

In table T3, -svy: tabulate- applies the weight adjustment to the 184
observations in the estimation sample.  The only way to prevent that is to fix
the adjusted weights ahead of time (see -help svygen-), but that isn't always
a good solution.  The poststratified sampling weights are designed to reduce
bias in the point estimates; however, when the poststratum IDs are also
available, -svy linearized- can produce more efficient variance estimates than
it can without them.

Ultimately, it is the researcher/data analyst who has the responsibility and
power to choose which analysis is most appropriate.

I can imagine real survey data where there are any number of different
poststratification adjustments one could apply for a given analysis.  Some
will make much more substantive sense than others.

Suppose we had a variable called -ns_postid- that simultaneously identified
the native status and sex of each individual in the dataset, and another
variable called -ns_postw- that contained the population size for the
corresponding group.  I think it is clear that this poststratification
information could be applied more broadly than the one in Michael's simulated
dataset.
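
To make the idea concrete, here is a small Python sketch of what such cell-level
information would buy. The variable names `ns_postid` and `ns_postw` follow the
description above; the records, cell labels, and population sizes are invented
for illustration, and the ratio adjustment shown is the textbook one, not
necessarily what Stata does internally.

```python
from collections import defaultdict

# Hypothetical records: (sampling weight, ns_postid, ns_postw).
# ns_postid codes the native-status-by-sex cell; ns_postw is the known
# population size of that cell.
sample = [
    (10.0, "native_F", 45.0),
    (20.0, "native_F", 45.0),
    (10.0, "other_F",  15.0),
    (20.0, "native_M", 18.0),
    (15.0, "other_M",  22.0),
    (15.0, "other_M",  22.0),
]

# Sum of sampling weights within each cell.
cell_weight = defaultdict(float)
for w, cell, _ in sample:
    cell_weight[cell] += w

# Ratio adjustment: scale each weight so the cell total matches ns_postw.
adjusted = [w * size / cell_weight[cell] for w, cell, size in sample]

# Because the cells line up with the subpopulation, its adjusted weights
# now sum to a known population size.
native_women = sum(a for a, rec in zip(adjusted, sample) if rec[1] == "native_F")

print(native_women, sum(adjusted))
```

With poststrata this fine, totals for a subpopulation defined by native status
and sex match the supplied cell sizes, which is the sense in which the
information "could be applied more broadly".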

--Jeff
[email protected]

PS.  There is an undocumented -svygen- command that will generate
poststratification-adjusted sampling weights; see -help svygen-.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


