[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
"Ariel Linden" <ariel.linden@gmail.com> |

To |
<statalist@hsphsun2.harvard.edu> |

Subject |
Re: st: multiple weights per person in GEE? |

Date |
Wed, 22 Jul 2009 08:56:53 -0700 |

Steve and Joseph, Thank you both for your continued devotion to this problem. Here is the code I used for the comparative models GLM in Stata vs GEE in SAS. **** GLM in Stata:**** . glm rx tx [aweight = iptw], family(gaussian) link(identity) vce(cluster personnumber) (sum of wgt is 4.1810e+05) Iteration 0: log pseudolikelihood = -459638.49 Generalized linear models No. of obs = 188831 Optimization : ML Residual df = 188829 Scale parameter = 7.617026 Deviance = 1438315.445 (1/df) Deviance = 7.617026 Pearson = 1438315.445 (1/df) Pearson = 7.617026 Variance function: V(u) = 1 [Gaussian] Link function : g(u) = u [Identity] AIC = 4.868274 Log pseudolikelihood = -459638.4948 BIC = -855694 (Std. Err. adjusted for 7868 clusters in personnumber) ---------------------------------------------------------------------------- -- | Robust rx | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+-------------------------------------------------------------- -- tx | 2.476136 .4841721 5.11 0.000 1.527176 3.425096 _cons | 1.145776 .0553768 20.69 0.000 1.03724 1.254313 ---------------------------------------------------------------------------- -- **** GEE in SAS:***** PROC GENMOD DATA=DATA; CLASS PERSONNU MONTH; MODEL RX = TX; WEIGHT IPTW; REPEATED subject=PERSONNU / type=exch; run; GEE (in SAS) IPTW Standard Wald 95% Confidence Chi- Parameter DF Estimate Error Limits Square Pr > ChiSq Intercept 1 1.1458 0.0094 1.1273 1.1642 14816.1 <.0001 SCCENROL 1 2.4761 0.0128 2.4511 2.5011 37695.3 <.0001 Scale 1 4.1067 0.0067 4.0936 4.1198 Date: Tue, 21 Jul 2009 12:22:35 -0400 From: sjsamuels@gmail.com Subject: Re: st: multiple weights per person in GEE? - -- Thank you, Joseph. I think that you are right. After I posted, I did some comparisons of aweights and pweights in -glm-. I saw one extreme case (weight = response) in which the standard errors differed by a factor of 2-4, but in the other comparisons, the standard errors were within 20% of each other. I saw nothing like Ariel's discrepancies. In the absence of a -vce- or -cluster- option, pweights lead to robust standard errors, whereas aweights lead to OIM (Observed Information Matrix) standard errors. Perhaps it's time for me to purchase Hardin and Hilbe's book on Generalized Linear Models. Here was the extreme case: sysuse auto,clear glm rep78 mpg [aweight=rep78], family(poisson) glm rep78 mpg [pweight=rep78], family(poisson) - -Steve On Tue, Jul 21, 2009 at 11:05 AM, Joseph Coveney<jcoveney@bigplanet.com> wrote: > Steve Samuels wrote: > > The WEIGHT used by SAS GENMOD is a dispersion parameter weight, which > should be equivalent to Stata's -aweight-. Weights for IPTW should > be proportional to probability weights, as in the reference Joseh > quotes. Ariel did not show us her Stata commands and output, as the > Statalist FAQ specify, but it's quite likely that she is comparing > different things. > > ---------------------------------------------------------------------- > ---------- > > You make a good point. Without the specifics from Ariel it will be > impossible to delve into what might be behind the large differences in > the regression coefficient standard errors between SAS and Stata. > > I wonder whether it has much to do with weights, though. In the > do-file below, which explores -glm- with varying within-panel weights, > -pweight- (-iweight-) and -aweight- yield identical coefficients and > standard errors. The same obtains with -iweight- and -pweight- with > -xtgee- in the dataset having constant within-panel weights. (There > is no -aweight- in -xtgee-.) > > More important, the M. A. Hernán in the _Stata Journal_ article that I > cited is the same Miguel Hernán in the SAS literature that Ariel > cites. I'm guessing that if there were a caveat arising from > different meanings of "weight" between the two packages in this > context, then it would have been brought out by the advocates of the > marginal structural model approach (or the journal's reviewrs) who are > familiar with both. I suspect that a comparison of the output > (deviance and pseudolikelihood, coefficients and their standard > errors, etc.) from following SAS code on the PROC IMPORTed CSV dataset > file produced by the do-file below would indicate that SAS is handling > the weights much as Stata does with > -pweight- (-iweight-). > > PROC GENMOD DATA=DUMMY DESCENDING; > CLASS PATIENT TREATMENT TIME; > MODEL SCORE=TREATMENT TIME / DIST=BIN; > REPEATED SUBJECT=PATIENT / CORR=IND; > SCWGT WEIGHT; > RUN; > > I'd wager that the coefficient standard errors would be essentially > the same as those from Stata. They might differ in the third > significant figure, or so, because of the convergence properties of > the algorithms used by Stata and SAS for fitting these models, default > tolerance settings, and so on. But I doubt that they would show the > 0.484 versus 0.013 difference that Ariel reports in SEs. > > Like you say, without the specifics, we can only guess at the problem. > In my experience, GLMs are very sensitive to collinearity among the > predictors, for example. PROC GENMOD is reported* to be coy about > convergence failures when it encounters compelete or quasicomplete > separation with binomial data, for another example. It might pay > Ariel to take a closer look at the SAS log (and at the iteration log in Stata) in the discrepant case if he hasn't already done so. > > Joseph Coveney > > * Paul D. Allison, Convergence Failures in Logistic Regression. Paper 360-2008. > _SAS Global Forum 2008_ > www2.sas.com/proceedings/forum2008/360-2008.pdf > > > clear * > set more off > version 10.1 > > matrix define Mu = J(1, 5, 0.5) > matrix define Sigma = J(5, 5, 0.5) + I(5) * 0.5 // findit ovbd // > findit ridder ovbd , stub(score) means(Mu) corr(Sigma) /// > seed(`=date("2009-07-21", "YMD")') n(100) clear generate byte > treatment = 0 > > tempfile tmpfil0 > save `tmpfil0' > > matrix input Mu = (0.4 0.45 0.5 0.55 0.6) ovbd , stub(score) means(Mu) > corr(Sigma) n(100) clear generate byte treatment = 1 append using > `tmpfil0' > > generate int patient = _n > reshape long score, i(patient) j(time) generate double weight = > runiform() > > outsheet patient treatment time score weight using dummy.csv, comma > names > > insheet patient treatment time score weight using dummy.csv, comma > names clear > > tabulate time, generate(time) > tabulate treatment, generate(treatment) > > log using comparem.log, text > > glm score treatment1 time1-time4 [iweight=weight], cluster(patient) > /// > family(binomial) link(logit) nolog > > glm score treatment1 time1-time4 [aweight=weight], cluster(patient) > /// > family(binomial) link(logit) nolog > > glm score treatment1 time1-time4 [pweight=weight], cluster(patient) > /// > family(binomial) link(logit) nolog > > bysort patient (time): replace weight = weight[1] > > xtgee score treatment1 time1-time4 [iweight=weight], i(patient) /// > family(binomial) link(logit) corr(independent) robust nolog > > xtgee score treatment1 time1-time4 [pweight=weight], i(patient) /// > family(binomial) link(logit) corr(independent) robust nolog > > glm score treatment1 time1-time4 [pweight=weight], cluster(patient) > /// > family(binomial) link(logit) nolog > > log close > > exit > > > > * > * For searches and help try: > * http://www.stata.com/help.cgi?search <http://www.stata.com/help.cgi?search> > * http://www.stata.com/support/statalist/faq <http://www.stata.com/support/statalist/faq> > * http://www.ats.ucla.edu/stat/stata/ <http://www.ats.ucla.edu/stat/stata/> > - -- Steven Samuels sjsamuels@gmail.com 18 Cantine's Island Saugerties NY 12477 USA 845-246-0774 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/

**Follow-Ups**:**Re: st: multiple weights per person in GEE?***From:*"Joseph Coveney" <jcoveney@bigplanet.com>

- Prev by Date:
**Re: st: Re: Prediction after xtpoisson, fe** - Next by Date:
**st: frequency bar charts of string variables** - Previous by thread:
**Re: st: multiple weights per person in GEE?** - Next by thread:
**Re: st: multiple weights per person in GEE?** - Index(es):

© Copyright 1996–2014 StataCorp LP | Terms of use | Privacy | Contact us | What's new | Site index |