[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Joseph Coveney" <jcoveney@bigplanet.com> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
Re: st: multiple weights per person in GEE? |

Date |
Wed, 22 Jul 2009 00:05:55 +0900 |

Steve Samuels wrote: The WEIGHT used by SAS GENMOD is a dispersion parameter weight, which should be equivalent to Stata's -aweight-. Weights for IPTW should be proportional to probability weights, as in the reference Joseh quotes. Ariel did not show us her Stata commands and output, as the Statalist FAQ specify, but it's quite likely that she is comparing different things. -------------------------------------------------------------------------------- You make a good point. Without the specifics from Ariel it will be impossible to delve into what might be behind the large differences in the regression coefficient standard errors between SAS and Stata. I wonder whether it has much to do with weights, though. In the do-file below, which explores -glm- with varying within-panel weights, -pweight- (-iweight-) and -aweight- yield identical coefficients and standard errors. The same obtains with -iweight- and -pweight- with -xtgee- in the dataset having constant within-panel weights. (There is no -aweight- in -xtgee-.) More important, the M. A. Hernán in the _Stata Journal_ article that I cited is the same Miguel Hernán in the SAS literature that Ariel cites. I'm guessing that if there were a caveat arising from different meanings of "weight" between the two packages in this context, then it would have been brought out by the advocates of the marginal structural model approach (or the journal's reviewrs) who are familiar with both. I suspect that a comparison of the output (deviance and pseudolikelihood, coefficients and their standard errors, etc.) from following SAS code on the PROC IMPORTed CSV dataset file produced by the do-file below would indicate that SAS is handling the weights much as Stata does with -pweight- (-iweight-). PROC GENMOD DATA=DUMMY DESCENDING; CLASS PATIENT TREATMENT TIME; MODEL SCORE=TREATMENT TIME / DIST=BIN; REPEATED SUBJECT=PATIENT / CORR=IND; SCWGT WEIGHT; RUN; I'd wager that the coefficient standard errors would be essentially the same as those from Stata. They might differ in the third significant figure, or so, because of the convergence properties of the algorithms used by Stata and SAS for fitting these models, default tolerance settings, and so on. But I doubt that they would show the 0.484 versus 0.013 difference that Ariel reports in SEs. Like you say, without the specifics, we can only guess at the problem. In my experience, GLMs are very sensitive to collinearity among the predictors, for example. PROC GENMOD is reported* to be coy about convergence failures when it encounters compelete or quasicomplete separation with binomial data, for another example. It might pay Ariel to take a closer look at the SAS log (and at the iteration log in Stata) in the discrepant case if he hasn't already done so. Joseph Coveney * Paul D. Allison, Convergence Failures in Logistic Regression. Paper 360-2008. _SAS Global Forum 2008_ www2.sas.com/proceedings/forum2008/360-2008.pdf clear * set more off version 10.1 matrix define Mu = J(1, 5, 0.5) matrix define Sigma = J(5, 5, 0.5) + I(5) * 0.5 // findit ovbd // findit ridder ovbd , stub(score) means(Mu) corr(Sigma) /// seed(`=date("2009-07-21", "YMD")') n(100) clear generate byte treatment = 0 tempfile tmpfil0 save `tmpfil0' matrix input Mu = (0.4 0.45 0.5 0.55 0.6) ovbd , stub(score) means(Mu) corr(Sigma) n(100) clear generate byte treatment = 1 append using `tmpfil0' generate int patient = _n reshape long score, i(patient) j(time) generate double weight = runiform() outsheet patient treatment time score weight using dummy.csv, comma names insheet patient treatment time score weight using dummy.csv, comma names clear tabulate time, generate(time) tabulate treatment, generate(treatment) log using comparem.log, text glm score treatment1 time1-time4 [iweight=weight], cluster(patient) /// family(binomial) link(logit) nolog glm score treatment1 time1-time4 [aweight=weight], cluster(patient) /// family(binomial) link(logit) nolog glm score treatment1 time1-time4 [pweight=weight], cluster(patient) /// family(binomial) link(logit) nolog bysort patient (time): replace weight = weight[1] xtgee score treatment1 time1-time4 [iweight=weight], i(patient) /// family(binomial) link(logit) corr(independent) robust nolog xtgee score treatment1 time1-time4 [pweight=weight], i(patient) /// family(binomial) link(logit) corr(independent) robust nolog glm score treatment1 time1-time4 [pweight=weight], cluster(patient) /// family(binomial) link(logit) nolog log close exit * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: multiple weights per person in GEE?***From:*sjsamuels@gmail.com

**References**:**Re: st: multiple weights per person in GEE?***From:*"Ariel Linden" <ariel.linden@gmail.com>

**Re: st: multiple weights per person in GEE?***From:*"Joseph Coveney" <jcoveney@bigplanet.com>

**Re: st: multiple weights per person in GEE?***From:*sjsamuels@gmail.com

- Prev by Date:
**RE: st: correlation for longitudinal data** - Next by Date:
**st: Consequences of OLS with poisson distribution** - Previous by thread:
**Re: st: multiple weights per person in GEE?** - Next by thread:
**Re: st: multiple weights per person in GEE?** - Index(es):

© Copyright 1996–2017 StataCorp LLC | Terms of use | Privacy | Contact us | What's new | Site index |