Statalist The Stata Listserver



Re: st: subpop and the mysterious sample size


From   jpitblado@stata.com (Jeff Pitblado, StataCorp LP)
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: subpop and the mysterious sample size
Date   Fri, 19 May 2006 09:14:54 -0500

ROBERT BOZICK <rbozick1@jhem.jhu.edu> is concerned about the 'Number of obs'
output in the header of -svymean- (replaced by -svy: mean- in Stata 9):

> I am working on a project where I need to use survey commands to estimate the
> standard error correctly --- my sample uses a stratified cluster design.   I
> created a variable called samp to indicate the analytic sample.  When samp =
> 1, then the respondent will be included in the analysis; when samp = 0 then
> the respondent will not be included in the analysis.  
> 
> The frequency of samp is shown below: 
> 
> tab samp 
> 
>        samp |      Freq.     Percent        Cum. 
> ------------+----------------------------------- 
>           0 |      6,917       42.25       42.25 
>           1 |      9,456       57.75      100.00 
> ------------+----------------------------------- 
>       Total |     16,373      100.00 
> 
> As you can see, there should be 9,456 in my analysis.
> When I use the svy commands to estimate means for my analytic sample using
> the subpop command,  the output reports that there are 15,548 used in the
> analysis.  Intuitively, that cannot be correct.  Does anyone know what is
> going on here?  How can I fix this so that it reports 9,456 instead of
> 15,548?  Thanks!
> 
> svymean var1, subpop(samp)  
> 
> Note: 11 strata omitted because they contain no subpopulation members
> 
> Survey mean estimation
> pweight:  f1pnlwt                                 Number of obs    =     15548
> Strata:   strat_id                                Number of strata =       350
> PSU:      psu                                     Number of PSUs   =       729
> Subpop.:  samp==1                                 Population size  = 3312561.5
> ------------------------------------------------------------------------------
>     Mean |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
> ---------+--------------------------------------------------------------------
>     var1 |   46.72176    .2878163    46.15584    47.28768    4.473885
> ------------------------------------------------------------------------------

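The 'Number of obs' in the -svymean- header counts the observations from the
full sample that are used in the estimation, not just those in the
subpopulation.  In Stata 8, unlike -svyregress-, -svymean- does not display
the subpopulation sample size.  However, Robert could use -svyregress- to
compute a subpopulation mean: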

	. svyregress var1, subpop(samp)

and have the subpopulation size and its sample size get reported in the
header.  Note that the subpopulation mean estimate and its standard error are
the same whether you use -svymean- or -svyregress-.

In Stata 9, all the -svy- estimation commands have a unified header, and this
header includes the subpopulation size and its sample size too.  The Stata 9
syntax, using Robert's example, is

	. svy, subpop(samp) : mean var1
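
For reference, here is a minimal sketch of the complete Stata 9 sequence.  It
assumes the design variables shown in Robert's header (f1pnlwt, strat_id, and
psu) and that the data still need to be -svyset- using the Stata 9 syntax;
adjust the names if your data differ.

	. svyset psu [pweight = f1pnlwt], strata(strat_id)
	. svy, subpop(samp) : mean var1
	. display e(N)
	. display e(N_sub)

After estimation, e(N) should hold the number of observations used from the
full sample and e(N_sub) the number of observations in the subpopulation.
The latter is the count Robert is after, provided no subpopulation members
are dropped because of missing values.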

--Jeff
jpitblado@stata.com