From: sjsamuels@gmail.com
To: statalist@hsphsun2.harvard.edu
Subject: st: Re:
Date: Thu, 2 Jul 2009 20:28:25 -0400

Arnold,

I cannot tell why the SEs are so different. The n's, the subpopulation size total, and the outcome means for "smkskul" are identical in all three analyses, so that is not a problem. I do see some issues.

1. In SAS, the variables in the CLUSTER statement should identify only the PSUs, the first-stage units. This should, however, lead to smaller, rather than larger, standard errors.

2. Stata thinks that there are 46 strata in the entire sample, but SAS thinks that there are 27. SUDAAN and SAS differ by about 1,000 in their report of the sample size for the original population.

3. The subpopulation seems confined to one PSU (one value of "skulid") and one stratum, but Stata says that there are nine PSUs with observations in the subpopulation. Perhaps Stata considers the second-stage units, classrooms, as PSUs in this case, and the others do not. If so, this could account for some of the discrepancy: between-classroom variation could be small if there are 16 individuals in nine classrooms.

4. The outcome, according to SUDAAN, is missing for 93% of the subpopulation sample.

I suggest that you make sure that variables and observations are identical in the data sets (I notice two different weight variables), and that the cluster, classroom, and stratum counts agree in SAS and Stata. Rerun your analyses on this outcome and on one with no missing values, and submit your findings to the group with a copy to Jeff Pitblado at Stata.

Good luck!

Steve

On Tue, Jun 30, 2009 at 3:50 PM, Levinson, Arnold<Arnold.Levinson@ucdenver.edu> wrote:
> Steve,
> Sorry for overlooking the obvious. Here are the commands and output. (I note as usual the wonderful output efficiency of Stata over the others.)
> arnold
> _____________________
> *Stata*
> svyset skulid [pw=w2f2f3], strata(strat) fpc(fpc) || classid
>
>       pweight: w2f2f3
>           VCE: linearized
>      Strata 1: strat
>          SU 1: skulid
>         FPC 1: fpc
>      Strata 2: <one>
>          SU 2: classid
>         FPC 2: <zero>
>
> . svy, subpop(if year==2008 & skulid==80001): mean smkskul
> (running mean on estimation sample)
>
> Survey: Mean estimation
>
> Number of strata =       1          Number of obs    =     131
> Number of PSUs   =       9          Population size  = 783.698
>                                     Subpop. no. obs  =      16
>                                     Subpop. size     = 120.542
>                                     Design df        =       8
>
> --------------------------------------------------------------
>              |             Linearized
>              |       Mean   Std. Err.     [95% Conf. Interval]
> -------------+------------------------------------------------
>      smkskul |   .5806258    .014649      .5468452    .6144064
> --------------------------------------------------------------
> Note: 45 strata omitted because they contain no subpopulation members
>
> ___________________
> SAS:
> PROC SURVEYMEANS DATA = ytabstest RATE = FPC;
> VAR SMKSKUL;
> STRATA STRAT;
> CLUSTER SKULID CLASSID;
> WEIGHT SKULWT;
> DOMAIN skulstrat;
> RUN;
>
> The SAS System 08:13 Tuesday, June 30, 2009 315
>
> The SURVEYMEANS Procedure
>
> Data Summary
> Number of Strata               27
> Number of Clusters           1282
> Number of Observations      21212
> Sum of Weights              98864
>
> Statistics
>                                           Std Error
> Variable   Label        N        Mean       of Mean        95% CL for Mean
> ---------------------------------------------------------------------------
> SMKSKUL    SMKSKUL   1706    0.488438      0.015833   0.45735470 0.51952078
> ---------------------------------------------------------------------------
>
> Domain Analysis: skulstrat
>
>                                                      Std Error
> skulstrat  Variable   Label        N        Mean       of Mean        95% CL for Mean
> --------------------------------------------------------------------------------------
>         0  SMKSKUL    SMKSKUL   1690    0.487015      0.016001   0.45560287 0.51842627
>         1  SMKSKUL    SMKSKUL     16    0.580626      0.104178   0.37624423 0.78500743
> --------------------------------------------------------------------------------------
>
> The SAS System 08:13 Tuesday, June 30, 2009 316
>
>
> PROC DESCRIPT DATA = ytabstest DESIGN = WOR;
> NEST STRAT SKULID CLASSID / MISSUNIT;
> TOTCNT TOTSAMP _MINUS1_ _MINUS1_;
> VAR SMKSKUL;
> CLASS SMKSKUL;
> WEIGHT SKULWT;
> SUBPOPN skulstrat = 1;
> RUN;
>
> S U D A A N
> Software for the Statistical Analysis of Correlated Data
> Copyright Research Triangle Institute August 2008
> Release 10.0
>
>
> DESIGN SUMMARY: Variances will be computed using the Taylor Linearization Method, Assuming a
> Without Replacement (WOR) Design
> Sample Weight: SKULWT
> Stage 1 Stratification Variable: STRAT
> Stage 1 Population Count Variable: TOTSAMP
> Stage 2 NEST Variable: SKULID (stage type is data dependent)
> Stage 2 Population Count Variable: _MINUS1_
> Stage 3 With Replacement Sampling Variable: CLASSID
> Stage 3 Population Count Variable: _MINUS1_
>
>
> Number of observations read    : 20434    Weighted count : 97843
> Observations in subpopulation  :   226    Weighted count :  1650
> Denominator degrees of freedom :   128
>
> Date: 06-30-2009 SUDAAN Page: 1
> Time: 13:38:12 Table: 1
>
> Frequencies and Values for CLASS Variables
> by: SMKSKUL.
>
> ----------------------------------
> SMKSKUL      Frequency    Value
> ----------------------------------
> Ordered
> Position:
>  1                   6    0
> Ordered
> Position:
>  2                  10    1
> ----------------------------------
>
>
> Date: 06-30-2009 SUDAAN Page: 2
> Time: 13:38:12 Table: 1
>
> Variance Estimation Method: Taylor Series (WOR)
> For Subpopulation: SKULSTRAT = 1
> by: Variable, SUDAAN Reserved Variable One.
>
> --------------------------------------------------------------------
> |          |                  | SUDAAN Reserved Variable    |
> | Variable |                  | One                         |
> |          |                  |-----------------------------|
> |          |                  | Total        | 1            |
> --------------------------------------------------------------------
> |          |                  |              |              |
> | SMKSKUL  | Sample Size      |           16 |           16 |
> |          | Weighted Size    |       120.54 |       120.54 |
> |          | Total            |        69.99 |        69.99 |
> |          | Lower 95% Limit  |              |              |
> |          |  Total           |       -39.85 |       -39.85 |
> |          | Upper 95% Limit  |              |              |
> |          |  Total           |       179.83 |       179.83 |
> |          | Mean             |      0.58063 |      0.58063 |
> |          | SE Mean          |         0.09 |         0.09 |
> |          | Lower 95% Limit  |              |              |
> |          |  Mean            |      0.39690 |      0.39690 |
> |          | Upper 95% Limit  |              |              |
> |          |  Mean            |      0.76435 |      0.76435 |
> --------------------------------------------------------------------
>
>> On Tue, Jun 30, 2009 at 12:41 PM, Levinson,
>> Arnold<Arnold.Levinson@ucdenver.edu> wrote:
>>> Survey analysis experts:
>>> I have data from a stratified two-stage school survey. The first stage sampled schools within strata, the second sampled classrooms within selected schools.
>>>
>>> When estimating variables of interest at the school level, I get hugely different variance estimates running Stata vs. SAS or SUDAAN. Stata's estimates are generally a lot smaller than SAS's or SUDAAN's, and the latter two are similar or identical to each other.

--
Steven Samuels
sjsamuels@gmail.com
18 Cantine's Island
Saugerties NY 12477
USA
845-246-0774

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
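The consistency checks suggested in the reply above (counts of strata, schools, and classrooms, plus the amount of missing outcome data) could be run on the Stata side roughly as follows. This is only a sketch: the variable names come from the svyset and svy commands quoted in the thread, the tag_* variables are made up for illustration, and it assumes the Stata data set holds the same observations that SAS and SUDAAN read.

```stata
* Sketch only. Variable names are taken from the quoted svyset/svy commands;
* the tag_* helper variables are hypothetical names introduced here.

* Flag one observation per stratum, per school (PSU), and per classroom
egen byte tag_strat = tag(strat)
egen byte tag_skul  = tag(strat skulid)
egen byte tag_class = tag(strat skulid classid)

* Distinct strata, schools, and classrooms, to compare with the SAS
* "Number of Strata" and "Number of Clusters" lines
count if tag_strat
count if tag_skul
count if tag_class

* Subpopulation observations, with and without a nonmissing outcome
count if year==2008 & skulid==80001
count if year==2008 & skulid==80001 & !missing(smkskul)

* Declare the design as in the original post, then let Stata report
* strata, sampling units, and missing-data counts for the outcome
svyset skulid [pw=w2f2f3], strata(strat) fpc(fpc) || classid
svydescribe smkskul
```

If the counts from these commands do not match what SAS and SUDAAN report, the data sets or design declarations differ, and that would need to be resolved before comparing standard errors.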
