
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Number of Obs with svy , suppop()


From   Michael Norman Mitchell <[email protected]>
To   [email protected]
Subject   Re: st: Number of Obs with svy , suppop()
Date   Fri, 19 Mar 2010 01:17:05 -0700

Dear Phil

Thank you for your reply. I am still struggling to understand this fully, so perhaps I have a more fundamental question: what is the formula for the "Number of obs" in the context of the -svy- commands? It sounds like, in the absence of the -subpop()- option, it is the number of observations with non-missing values on the tabulated variable. In the presence of the -subpop()- option, it is the total number of observations minus the number of observations that are in the subpopulation and are missing on the tabulated variable. Am I on the right track here?

Many thanks!

Michael N. Mitchell
See the Stata tidbit of the week at...
http://www.MichaelNormanMitchell.com

On 2010-03-18 5.04 PM, Phil Schumm wrote:
On Mar 18, 2010, at 6:20 PM, Michael Mitchell wrote:
Here is the tabulation of sex by race.

<snip>

. tab sex race, missing

   1=male, |          1=white, 2=black, 3=other
  2=female |     White      Black      Other          . |     Total
-----------+--------------------------------------------+----------
      male |     1,676        193         35         34 |     1,938
    female |     1,824        238         34         37 |     2,133
-----------+--------------------------------------------+----------
     Total |     3,500        431         69         71 |     4,071

<snip>

But now I want to analyze just the sub-population of males (sex==1) and it shows that the number of obs is now 4037 (see below). How can the number of observations increase when adding a -subpop()- option? There are suddenly 37 extra observations. Note this corresponds to the number of females with a missing race.

. svy , subpop(if sex==1): tab race, count format(%13.2fc)
(running tabulate on estimation sample)

Number of strata   =         1          Number of obs      =      4037
Number of PSUs     =      4037          Population size    = 7932333.9
                                        Subpop. no. of obs =      1904
                                        Subpop. size       = 3780355.3
                                        Design df          =      4036


This is as it should be, since information about race is not required on those observations outside of the subpopulation. Remember, observations outside the subpopulation are relevant only insofar as they reflect the variability in the proportion(s) of sampled PSUs with at least one observation in the subpopulation.
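
The counts in the tabulation above are consistent with this. What follows is a minimal sketch, not the original dataset: it rebuilds the eight cells of the sex-by-race table by hand (the variable names sex, race, and n and their coding are assumptions taken from the table labels) and counts observations under each rule. It assumes no observations are dropped for other reasons, such as missing weights or design variables.

```stata
* Rebuild the quoted sex-by-race table, one row per cell.
* (Illustrative data only; names and coding are assumed from the table.)
clear
input sex race n
1 1 1676
1 2 193
1 3 35
1 . 34
2 1 1824
2 2 238
2 3 34
2 . 37
end

* One row per observation, as in the original 4,071-observation sample
expand n
drop n

* Without subpop(): race must be non-missing for every observation
count if !missing(race)
scalar n_nosub = r(N)

* With subpop(if sex==1): race must be non-missing only inside the subpopulation
count if sex != 1 | !missing(race)
scalar n_sub = r(N)

* Subpopulation members that actually contribute to the tabulation
count if sex == 1 & !missing(race)
scalar n_subpop = r(N)

display n_nosub " " n_sub " " n_subpop
```

The three counts come out as 4000, 4037, and 1904. The second and third match the "Number of obs" and "Subpop. no. of obs" in the output above, and the difference between the first two is the 37 females with missing race.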

In fact, at one point Stata did not behave properly in this regard; this was fixed in an update to Stata 10 on 02apr2008 (see -help whatsnew10- and search for "02apr2008").


-- Phil

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*

