
From: jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Svy mean using subpop and incorrect number of observations
Date: Thu, 17 Dec 2009 15:23:53 -0600

Heather E. Ridolfo <evd7@CDC.GOV> asks about a Statalist exchange we had in
July of last year:

> I posted a message in July 2008 about problems I was encountering when
> using svy mean:
> "Using svy: mean- with option -subpop() I noticed that it is reporting a
> smaller estimation sample than the number of observations in my
> dataset."
>
> The reply I got back said:
> "We have verified that -svy: mean- is incorrectly dropping out-of-subpop
> observations that contain missing values in the variables of the
> varlist. The only other affected commands are -svy: proportion-, -svy:
> ratio-, and -svy: total-. We hope to have this fixed in the next Stata
> update (within the next few weeks)"
>
> However, I continue to experience this problem a year and half later
> when trying to run the following command:
> Svyset PSU [pweight = nweight], strata(STRATUM) singleunit(centered)
> Svy, subpop(allsp): mean RA sevimpft ADLS IADLS help UseAD
>
> The number of observations I get back is smaller than the number of
> actual observation in the dataset. I am using Stata 10 and as far as I
> can tell it's up-to-date.
>
> Does anyone have any suggestions on how I can fix this problem?

In the Stata 10 whatsnew, the update on 18aug2009 contains the following
item:

    48. svy: mean, svy: proportion, svy: ratio, and svy: total would mark
        out observations with missing values in the summary variables even
        when the sampling weight was zero, which is a surrogate for
        identifying out-of-subpopulation observations.  This has been
        fixed.

Given Heather's example, -svy- will drop observations containing missing
values in any of the following variables:

        PSU nweight STRATUM

-svy- will then check the following variables for missing values only
within the subpopulation observations:

        RA sevimpft ADLS IADLS help UseAD

The following simple example illustrates that -svy- drops only those
observations with missing values that fall within the subpopulation.

        . sysuse auto
        . tabulate rep78 foreign, missing nolabel
        . svyset _n
        . svy, subpop(if for==0): mean rep78
        . svy, subpop(if for==1): mean rep78

In the following output from Stata 10, -tabulate- shows that -rep78- is
missing in 5 observations: 4 observations where foreign=0 and 1 observation
where foreign=1. The two calls to -svy: mean- show that the sample size is
70 and 73, respectively; that is, each call drops from the 74 observations
only the missing values inside its own subpopulation.

***** BEGIN:
. sysuse auto
(1978 Automobile Data)

. tabulate rep78 foreign, missing nolabel

    Repair |
    Record |       Car type
      1978 |         0          1 |     Total
-----------+----------------------+----------
         1 |         2          0 |         2
         2 |         8          0 |         8
         3 |        27          3 |        30
         4 |         9          9 |        18
         5 |         2          9 |        11
         . |         4          1 |         5
-----------+----------------------+----------
     Total |        52         22 |        74

. svyset _n

      pweight: <none>
          VCE: linearized
  Single unit: missing
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: <zero>

. svy, subpop(if for==0): mean rep78
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      70
Number of PSUs   =      70          Population size  =      70
                                    Subpop. no. obs  =      48
                                    Subpop. size     =      48
                                    Design df        =      69

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       rep78 |   3.020833   .1205044      2.780434    3.261233
--------------------------------------------------------------

. svy, subpop(if for==1): mean rep78
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      73
Number of PSUs   =      73          Population size  =      73
                                    Subpop. no. obs  =      21
                                    Subpop. size     =      21
                                    Design df        =      72

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       rep78 |   4.285714   .1537776      3.979164    4.592264
--------------------------------------------------------------
***** END:

--Jeff
jpitblado@stata.com

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
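
The observation counts in the output above can be checked by hand from the
-tabulate- cell counts. What follows is a minimal Python sketch of that
arithmetic. It is not Stata code and not how -svy- is implemented; the
function name svy_subpop_counts and the dictionary layout are made up for
illustration. It assumes the unweighted design set by -svyset _n-, so the
subpopulation mean reduces to a plain average.

```python
# Cell counts copied from the -tabulate rep78 foreign, missing nolabel- output.
# Outer key: foreign (0 or 1). Inner key: rep78 value; None stands for
# Stata's missing value (.).
counts = {
    0: {1: 2, 2: 8, 3: 27, 4: 9, 5: 2, None: 4},
    1: {1: 0, 2: 0, 3: 3, 4: 9, 5: 9, None: 1},
}


def svy_subpop_counts(counts, subpop):
    """Hypothetical helper mimicking the counting rule described in the post.

    Observations with missing rep78 are dropped only when they lie inside
    the subpopulation; out-of-subpopulation observations are always kept.
    Assumes equal weights, so the mean is a simple average.
    """
    total = sum(sum(cell.values()) for cell in counts.values())
    inside = counts[subpop]
    n_obs = total - inside[None]  # "Number of obs"
    n_sub = sum(n for value, n in inside.items() if value is not None)
    weighted_sum = sum(value * n for value, n in inside.items() if value is not None)
    return n_obs, n_sub, weighted_sum / n_sub


for subpop in (0, 1):
    n_obs, n_sub, mean = svy_subpop_counts(counts, subpop)
    print(f"foreign=={subpop}: obs={n_obs} subpop_obs={n_sub} mean={mean:.6f}")
```

For foreign==0 this gives 74 - 4 = 70 observations and 48 subpopulation
observations; for foreign==1 it gives 74 - 1 = 73 and 21, matching the
"Number of obs" and "Subpop. no. obs" lines in the two -svy: mean- results.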
