Re: st: Number of Obs with svy , suppop()

From	Phil Schumm <[email protected]>
To	[email protected]
Subject	Re: st: Number of Obs with svy , suppop()
Date	Fri, 19 Mar 2010 14:42:34 -0500

On Mar 19, 2010, at 3:17 AM, Michael Norman Mitchell wrote:

Thank you for your reply... I am still struggling to solidly understand this. Perhaps I have a more fundamental question. What is the formula for the "Number of obs" in the context of the -svy- commands. It sounds like, in the absence of the -subpop()- option, it is the number of observations with non-missing values on the tabulated variable. And, in the presence of the -subpop()- option it is the total number of observations minus the number of observations that meet the -subpop()- option and are missing on the tabulated variable. Am I on the right track here?

Yes, I believe this is correct (note however that I haven't looked into this carefully, so if you need confirmation of Stata's behavior WRT this issue, you'll need to get it from the manual or from someone like Jeff). One more thing I should mention: How you proceed in cases like this may depend on the reason(s) that the data are missing. For example, suppose the missing values for race are due to respondents refusing to answer the question or saying "I don't know." In that case, Durbin argued that this should be taken into account when defining the subpopulation (also referred to in the survey literature as a domain).[1] IOW, in your example, the subpopulation of interest would be "all males who, when asked, will provide an answer to this question." In this case, you would augment your -subpop()- specification like this:


    svy, subpop(if sex==1 & !mi(race)):

in which case the "number of observations" reported by Stata should now correspond to the total number of observations in your dataset. More importantly, this would specify a slightly different variance calculation, though the actual result may only differ very slightly (if at all) depending on the circumstances. Note that I almost never see anyone do this -- at least not in the applied social science literature.

Of course, what I just described does nothing to address the possible bias that might arise if those who don't respond differ (in terms of race) from those who do...
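
The three specifications discussed above can be compared side by side. What follows is only a minimal sketch, not a verified account of Stata's behavior. It assumes the -nhanes2- example dataset (which is distributed with its -svyset- settings and contains variables named sex and race) as a stand-in for the data in the thread, and it creates artificial missing values on race purely for illustration. Run it as a do-file.

```stata
* Assumption: nhanes2 stands in for the poster's data; its svyset
* settings are stored with the dataset
webuse nhanes2, clear

* Assumption: blank out race for every 25th observation to mimic
* item nonresponse (hypothetical, not from the thread)
replace race = . if mod(_n, 25) == 0

* Reference counts to compare against the reported "Number of obs"
count
count if !mi(race)
count if sex == 1 & mi(race)

* 1. No subpopulation
svy: tabulate race
display "Number of obs: " e(N)

* 2. Subpopulation defined by sex only
svy, subpop(if sex == 1): tabulate race
display "Number of obs: " e(N)

* 3. Subpopulation restricted to those with a recorded race
svy, subpop(if sex == 1 & !mi(race)): tabulate race
display "Number of obs: " e(N)
```

Setting e(N) from each call against the three counts is a quick way to check whether the formula proposed in the question holds for a given version of Stata.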



-- Phil

[1] J. Durbin. Sampling theory for estimates based on fewer individuals than the number selected. Bulletin of the International Statistical Institute, 36(3):113–119, 1958.
