Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Steve Samuels <sjsamuels@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: st: Stata coding to SAS
Date: Wed, 23 Jan 2013 09:28:24 -0500

Reading the -help- for -svyset-, or my previous post, would have told you what the postweight() variable should be and how to convert the Stata variable to a formal probability weight for SAS. And, as the error message you reported separately indicated, the postweight() variable must have the same value for everyone in the same poststratum. So giving cases a constant value is incorrect.

Another issue is that SAS is seeing 5 more observations than Stata--you will have to track this one down.

But you have other problems: Re-weighting by deprivation will lead to biased estimates of associations with deprivation or with related factors. Moreover, the point of matching as you did it is to make control distributions resemble case distributions, not vice-versa.

As you don't really have a probability sample, use of the subpop() option is unnecessary.

A good sampling reference is Sharon Lohr: Sampling Design and Analysis, 2009, Cengage.

I'm curious: Where are you working or studying?

Steve

Jan 21, 2013, at 11:36 PM, K C Wong wrote:

Thanks Steven and Stas for your kind replies. And my apologies for my previous uninformative post.

The study is a multi-ethnic, age- and ethnicity-matched population case-control study. A weight was calculated for each stratum of ethnicity (let's say European, Asian and Indian) * deprivation (5-category), by dividing the expected deprivation distribution of each ethnic group by the observed deprivation distribution in the controls from our study, because of the low response rates and differential non-response by deprivation quintile. The expected distributions were estimated to be xx%, ....., xx% for European and xx%,...., xx% for both Asian and Indian. I have a total of 4342 respondents.

The Stata weighting coding is done as below:

. gen aweight=1 if case==0
. replace aweight = 0.881 if ethnicity==0 & deprivation==0 & case==0
. replace aweight = 0.813 if ethnicity==0 & deprivation==1 & case==0
. replace aweight = 0.962 if ethnicity==0 & deprivation==2 & case==0
. replace aweight = 1.105 if ethnicity==0 & deprivation==3 & case==0
. replace aweight = 1.667 if ethnicity==0 & deprivation==4 & case==0
. replace aweight = 0.881 if ethnicity==0 & deprivation==. & case==0
. replace aweight = 0.187 if ethnicity==1 & deprivation==0 & case==0
. replace aweight = 0.190 if ethnicity==1 & deprivation==1 & case==0
. replace aweight = 0.565 if ethnicity==1 & deprivation==2 & case==0
. replace aweight = 0.833 if ethnicity==1 & deprivation==3 & case==0
. replace aweight = 2.044 if ethnicity==1 & deprivation==4 & case==0
. replace aweight = 0.435 if ethnicity==2 & deprivation==0 & case==0
. replace aweight = 0.323 if ethnicity==2 & deprivation==1 & case==0
. replace aweight = 0.926 if ethnicity==2 & deprivation==2 & case==0
. replace aweight = 0.948 if ethnicity==2 & deprivation==3 & case==0
. replace aweight = 1.236 if ethnicity==2 & deprivation==4 & case==0
. replace aweight = 1.236 if ethnicity==2 & deprivation==. & case==0
. replace aweight = 1 if case==1 /* since the weighting is only for controls */

. gen poststrata=0
. recode poststrata 0=1 if ethnicity==0 & nzdepgrp==0 & case==0
. recode poststrata 0=2 if ethnicity==0 & nzdepgrp==1 & case==0
. recode poststrata 0=3 if ethnicity==0 & nzdepgrp==2 & case==0
. recode poststrata 0=4 if ethnicity==0 & nzdepgrp==3 & case==0
. recode poststrata 0=5 if ethnicity==0 & nzdepgrp==4 & case==0
. recode poststrata 0=. if ethnicity==0 & nzdepgrp==. & case==0
. recode poststrata 0=6 if ethnicity==1 & nzdepgrp==0 & case==0
. recode poststrata 0=7 if ethnicity==1 & nzdepgrp==1 & case==0
. recode poststrata 0=8 if ethnicity==1 & nzdepgrp==2 & case==0
. recode poststrata 0=9 if ethnicity==1 & nzdepgrp==3 & case==0
. recode poststrata 0=10 if ethnicity==1 & nzdepgrp==4 & case==0
. recode poststrata 0=11 if ethnicity==2 & nzdepgrp==0 & case==0
. recode poststrata 0=12 if ethnicity==2 & nzdepgrp==1 & case==0
. recode poststrata 0=13 if ethnicity==2 & nzdepgrp==2 & case==0
. recode poststrata 0=14 if ethnicity==2 & nzdepgrp==3 & case==0
. recode poststrata 0=15 if ethnicity==2 & nzdepgrp==4 & case==0
. recode poststrata 0=. if ethnicity==2 & nzdepgrp==. & case==0
. recode poststrata 0=16 if case==1

. gen european=0
. recode european 0=1 if eth_new==1
. gen asian=0
. recode asian 0=1 if eth_new==2
. gen indian=0
. recode indian 0=1 if eth_new==0

. svyset _n, poststrata(poststrata) postweight(aweight)
. svy, subpop(european): logistic case i.age_cat i.interviewmethod i.status i.deprivation
(running logistic on estimation sample)

Survey: Logistic regression

Number of strata   =    1        Number of obs      =      4337
Number of PSUs     = 4337        Population size    =    14.996
N. of poststrata   =   17        Subpop. no. of obs =       259
                                 Subpop. size       = 3.9069105
                                 Design df          =      4336
                                 F(  10,   4327)    =      1.92
                                 Prob > F           =    0.0382

SAS:

proc surveylogistic data=bc;
strata poststrata;
weight aweight;
domain european;
class age_cat(ref="0") interviewmethod(ref="3") deprivation(ref="0") / param=ref;
model case(event="1") = age_cat interviewmethod status deprivation;
run;

Domain Summary
Number of Observations                  4342
Number of Observations in Domain         264
Number of Observations not in Domain    4078
Sum of Weights in Domain           267.82300

Variance Estimation
Method                 Taylor Series
Variance Adjustment    Degrees of Freedom (DF)

Number of Observations Read    4342
Number of Observations Used    4331
Sum of Weights Read         267.823
Sum of Weights Used         261.643

I understand that I have not done it right because of my lack of understanding of weighting. I'd greatly appreciate it if anyone could shed some further light on this.

Once again, thanks Steven and Stas.

On Thu, Jan 17, 2013 at 6:17 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
> I agree with Stas's diagnosis.
>
> Stratum h: n_h subjects, N_h population total
>
> In Stata: post weights are N_h, but normalized are N_h/n_h, so sum to
> N_h in the stratum.
>
> In SAS: probability weights are N_h, so sum to n_h x N_h in the stratum.
>
> To get corresponding weights in SAS, KC could create "probability
> weights" for SAS as wt_h = N_h/n_h.
>
> But post-strata are technically subpopulations (also known as "domains"), so,
> depending on sample size, the standard errors given by SAS could be
> wrong. There's also a question of whether reference groups are the same for
> the predictor with reference group "3" in SAS.
>
> KC specifies a subpopulation "european" in his analyses.
> Post-stratifying with a subpopulation can give poor results if
> population and subpopulation stratum proportions are very different. In
> such a case KC could be better off not post-stratifying. KC's sample
> might not have been drawn SRS, so that even Stata might not be giving
> proper standard errors. All in all, I'd like to see more information and the
> results as I requested.
>
> Steven Samuels
> Consulting Statistician
> 18 Cantine's Island
> Saugerties, NY 12477 USA
> 845-246-0774
>
> On Jan 16, 2013, at 10:57 AM, Stas Kolenikov wrote:
>
> Stata's -postweight()- is the target sum of weights for a given
> poststratum, rather than a weight variable as you specified for SAS.
> Besides, I am not sure SAS supports poststratification, unlike Stata.
>
> --
> -- Stas Kolenikov, PhD, PStat (SSC) :: http://stas.kolenikov.name
> -- Senior Survey Statistician, Abt SRBI :: work email kolenikovs at
> srbi dot com
> -- Opinions stated in this email are mine only, and do not reflect the
> position of my employer
>
> On Wed, Jan 16, 2013 at 9:05 AM, Steve Samuels <sjsamuels@gmail.com> wrote:
>> KC
>>
>> You will have a better chance of getting an answer to your question if,
>> as the FAQ requests, you show the results of all commands.
>>
>> Also note that the one "rule" on Statalist is to use full names.
>> If you are professionally known as "KC Wong", then please give your
>> affiliation.
>>
>> Steve
>>
>> On 12 January 2013 KC Wong wrote:
>>
>>> I wish to translate the below Stata coding to SAS and I'm wondering if
>>> I have the SAS coding right because the result from Stata differs from
>>> SAS's.
>>> I'm now using StataIC 11.
>>>
>>> Stata:
>>> svyset _n, poststrata(poststrata) postweight(aweight)
>>> svy, subpop(european): logistic case i.age_cat i.interviewmethod
>>> i.status i.deprivation
>>>
>>> SAS:
>>> proc surveylogistic data=bc;
>>> strata poststrata;
>>> weight aweight;
>>> domain european;
>>> class age_cat(ref="0") interviewmethod(ref="3") deprivation(ref="0")
>>> / param=ref;
>>> model case(event="1") = age_cat interviewmethod status deprivation;
>>> run;
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
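
The weight conversion described in the quoted Jan 17 message (a per-observation SAS weight of wt_h = N_h/n_h, which sums to N_h within each poststratum) can be checked with a small sketch. It is written in Python purely for illustration, not as Stata or SAS code; the function name `sas_weights`, the variable names, and the toy numbers are all hypothetical and do not come from KC's data.

```python
from collections import Counter

def sas_weights(strata, totals):
    """Return one weight per observation, N_h / n_h.

    strata: poststratum label for each observation (hypothetical input)
    totals: mapping from poststratum label to its target total N_h
    """
    counts = Counter(strata)  # n_h: observations per poststratum
    return [totals[h] / counts[h] for h in strata]

# Toy data (assumed for illustration): 4 observations in stratum 1, 2 in stratum 2
strata = [1, 1, 1, 1, 2, 2]
totals = {1: 100.0, 2: 30.0}

weights = sas_weights(strata, totals)

# Sum the weights within each poststratum; each sum should equal N_h
sums = {}
for h, w in zip(strata, weights):
    sums[h] = sums.get(h, 0.0) + w

print(weights)
print(sums)
```

By contrast, handing the totals N_h themselves to a procedure that treats them as per-observation weights would give a stratum sum of n_h x N_h, which is the mismatch the quoted message points out.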
