Statalist The Stata Listserver


[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

Re: [SPAM] RE: st: Question regarding linktest and its implementation


From   Phil Schumm <[email protected]>
To   [email protected]
Subject   Re: [SPAM] RE: st: Question regarding linktest and its implementation
Date   Mon, 7 Aug 2006 10:23:15 -0500

On Aug 5, 2006, at 7:50 PM, Daniel Schneider wrote:
I have been using the linktest command after GLM models quite frequently, and I came to wonder whether it is implemented correctly or not.
<snip>

Is it possible that the revision of GLM models for the more recent stata version has changed some of the meaning of predictions and is not correctly implemented in the linktest command that seems to be based on an older Stata version?
On Aug 6, 2006, at 12:33 PM, Daniel Schneider wrote:
This is the replication of the linktest. Note that the results are similar but not identical because linktest seems to rely on an older version of GLM.

The test implemented by -linktest- (known as Tukey's one-degree-of- freedom test for non-additivity) is a test of the null hypothesis that the effects of the covariates are additive against alternatives in which they are not (the introduction of the square of the linear predictor to the model may be thought of as capturing a range of such alternatives). Since one of the purposes of the link function in a GLM is to linearize the effect of the covariates, in this context, it may also be thought of as a test of the adequacy of the link function. This is discussed in detail by Pregibon (1980), who proposed a slightly different approach; the entry under [R] - linktest- provides the full references to Pregibon (1980) and to Tukey's original 1949 paper, and explains why Stata uses Tukey's original formulation. I strongly suggest reading Pregibon's paper, which provides a nice way to think about what is going on in the GLM context.

Your posting also raised another (minor) issue worth commenting on. In particular, you demonstrated the following


. sysuse auto
(1978 Automobile Data)

. glm price trunk foreign weight, fam(gam)
<output omitted>

. predict xb, xb

. gen xb2 = xb^2

. linktest, fam(gam) nolog

Residual df = 71 No. of obs = 74
Pearson X2 = 5.113241 Deviance = 4.375989
Dispersion = .0720175 Dispersion = .0616336

Gamma distribution, power link (power = -1)
------------------------------------------------------------------------ ------
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
_hat | 1.333585 .3680821 3.62 0.000 . 6121571 2.055012
_hatsq | -1117.429 1200.049 -0.93 0.352 -3469.481 1234.624
_cons | -.0000214 .0000261 -0.82 0.412 -. 0000726 .0000298
------------------------------------------------------------------------ ------

. glm price xb xb2, fam(gam) nohead nolog

------------------------------------------------------------------------ ------
| OIM
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
xb | 1.333624 .3679106 3.62 0.000 . 6125322 2.054715
xb2 | -1117.528 1199.493 -0.93 0.352 -3468.491 1233.436
_cons | -.0000214 .0000261 -0.82 0.412 -. 0000726 .0000297
------------------------------------------------------------------------ ------


noting that the results obtained by manually regressing price on xb and xb2 are slightly different from those obtained via -linktest-. A quick look at the code for -linktest- (type -viewsource linktest.ado-) reveals why; namely, there is a -version 6- at the top, which means that when the model including _hatsq is fit, Stata is running under version control as if it were version 6. Under version 6, -glm- used Iteratively Reweighted Least Squares (IRLS) and reported standard errors based on the expected information matrix. One of the changes made in version 7 was to make use of Stata's -ml- the default, as well as reporting standard errors based on the observed information matrix. Thus, to replicate the results of - linktest-, you must use the -irls- option:


. glm price xb xb2, fam(gam) nohead nolog irls

------------------------------------------------------------------------ ------
| EIM
price | Coef. Std. Err. z P>|z| [95% Conf. Interval]
------------- +----------------------------------------------------------------
xb | 1.333585 .3680821 3.62 0.000 . 6121571 2.055012
xb2 | -1117.429 1200.049 -0.93 0.352 -3469.481 1234.624
_cons | -.0000214 .0000261 -0.82 0.412 -. 0000726 .0000298
------------------------------------------------------------------------ ------


Given that -glm- now uses -ml- by default, this behavior seems a bit inconsistent (though, admittedly, it is a minor issue).


-- Phil

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index