Sorry all, I understand Phil's point now: they are added to the total
population, not the subpopulation.

The bit I pasted below was unintentional. I thought it might have
something to do with the missing data, but no, and then I didn't
delete it. Sorry!

Cheers,
Tim

On Thu, Mar 18, 2010 at 9:14 PM, Tim Scharks <tim.scharks@gmail.com> wrote:
> That sounds reasonable, Phil, but why are the WOMEN with (missing) for
> race added to the sample when men==1 was specified?
>
> Thanks,
> Tim
>
> takes on values other than 0 and 1
> subpop() != 0 indicates subpopulation
> (running tabulate on estimation sample)
>
> Number of strata   =         1       Number of obs      =       4000
> Number of PSUs     =      4000       Population size    =  7880496.9
>                                      Subpop. no. of obs =       4000
>                                      Subpop. size       =  7880496.9
>                                      Design df          =       3999
>
> ------------------------
>  1=white, |
>  2=black, |
>   3=other |        count
> ----------+-------------
>     White | 6,930,316.91
>     Black |   754,879.69
>     Other |   195,300.31
>           |
>     Total | 7,880,496.91
> ------------------------
>   Key:  count  =  weighted counts
>
> .
> end of do-file
>
>
> On Thu, Mar 18, 2010 at 5:04 PM, Phil Schumm <pschumm@uchicago.edu> wrote:
>> On Mar 18, 2010, at 6:20 PM, Michael Mitchell wrote:
>>>
>>> Here is the tabulation of race and sex by race.
>>
>> <snip>
>>
>>> . tab sex race, missing
>>>
>>>    1=male, |      1=white, 2=black, 3=other
>>>   2=female |     White      Black      Other          . |     Total
>>> -----------+--------------------------------------------+----------
>>>       male |     1,676        193         35         34 |     1,938
>>>     female |     1,824        238         34         37 |     2,133
>>> -----------+--------------------------------------------+----------
>>>      Total |     3,500        431         69         71 |     4,071
>>
>> <snip>
>>
>>> But now I want to analyze just the sub-population of males (sex==1) and it
>>> shows that the number of obs is now 4037 (see below). How can the number of
>>> observations increase when adding a -subpop()- option? There are suddenly
>>> 37 extra observations. Note this corresponds to the number of females with a
>>> missing race.
>>>
>>> . svy , subpop(if sex==1): tab race, count format(%13.2fc)
>>> (running tabulate on estimation sample)
>>>
>>> Number of strata   =         1       Number of obs      =       4037
>>> Number of PSUs     =      4037       Population size    =  7932333.9
>>>                                      Subpop. no. of obs =       1904
>>>                                      Subpop. size       =  3780355.3
>>>                                      Design df          =       4036
>>
>>
>> This is as it should be, since information about race is not required on
>> those observations outside of the subpopulation. Remember, observations
>> outside the subpopulation are relevant only insofar as they reflect the
>> variability in the proportion(s) of sampled PSUs with at least one
>> observation in the subpopulation.
>>
>> In fact, at one point Stata did not behave properly in this regard; this was
>> fixed in an update to Stata 10 on 02apr2008 (see -help whatsnew10- and
>> search for "02apr2008").
>>
>>
>> -- Phil
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
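
For anyone who wants to check the numbers in this thread by hand, here is a minimal sketch in Python (not Stata). It uses only the cell counts from the quoted -tab sex race, missing- table. The function and variable names are made up for illustration, and the rule it encodes (race has to be nonmissing only for observations inside the subpopulation) is a reading of Phil's explanation, not Stata's actual implementation.

```python
# Cell counts copied from the quoted -tab sex race, missing- output.
# None stands for a missing race (the "." column).
counts = {
    "male":   {"White": 1676, "Black": 193, "Other": 35, None: 34},
    "female": {"White": 1824, "Black": 238, "Other": 34, None: 37},
}


def n_obs(counts, subpop_sex=None):
    """Size of the estimation sample under the assumed rule.

    An observation is kept if race is nonmissing, or if it lies outside
    the subpopulation (where race is not needed). With subpop_sex=None,
    everyone counts as inside the subpopulation.
    """
    total = 0
    for sex, by_race in counts.items():
        in_subpop = subpop_sex is None or sex == subpop_sex
        for race, n in by_race.items():
            if race is not None or not in_subpop:
                total += n
    return total


def n_subpop(counts, subpop_sex):
    """Subpopulation observations: members with nonmissing race."""
    return sum(n for race, n in counts[subpop_sex].items() if race is not None)


print(n_obs(counts))             # 4000: all 71 missing-race rows dropped
print(n_obs(counts, "male"))     # 4037: the 37 missing-race females stay in
print(n_subpop(counts, "male"))  # 1904
```

The three printed values match "Number of obs" without -subpop()-, "Number of obs" with subpop(if sex==1), and "Subpop. no. of obs" in the quoted output, which is consistent with the 37 extra observations being the females with missing race.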

