# RE: st: glm and reg produce different results for loglinear model?

From: jverkuilen
Subject: RE: st: glm and reg produce different results for loglinear model?
Date: Mon, 3 Nov 2008 10:27:18 -0500

Log link and log transform are not the same model.

On what scale do you expect the errors to operate? The log transform implies lognormal errors, which are multiplicative on the original scale. The log link takes the log of the expected value, with errors that are additive on the original scale.
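A sketch of the distinction in symbols may help. Here y stands for outagecost and x for lnmwhannual; the symbols \beta_0, \beta_1, \varepsilon, u and \sigma^2 are notation assumed for illustration, not anything taken from the thread.

```latex
\begin{align*}
\text{log transform (reg):}\quad
  & \ln y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
  \;\Longleftrightarrow\;
  y_i = e^{\beta_0 + \beta_1 x_i}\, e^{\varepsilon_i} \\
\text{log link (glm):}\quad
  & \ln E[y_i \mid x_i] = \beta_0 + \beta_1 x_i
  \;\Longleftrightarrow\;
  y_i = e^{\beta_0 + \beta_1 x_i} + u_i
\end{align*}
```

If \varepsilon_i is N(0, \sigma^2) in the first line, then E[y_i | x_i] = exp(\beta_0 + \beta_1 x_i + \sigma^2/2). So reg describes the mean of ln y while glm describes the log of the mean of y, and the two need not yield the same coefficients.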

-----Original Message-----
From: "Matthew Mercurio (matthewmercurio)" <matthewmercurio@fscgroup.com>
To: statalist@hsphsun2.harvard.edu
Sent: 10/31/2008 5:43 PM
Subject: st: glm and reg produce different results for loglinear model?

I have two variables,

(1) outagecost (estimated costs to each customer of a short electrical
power interruption)
(2) mwhannual (annual megawatt hours of electricity consumption for each
customer)

Since these variables appear approximately lognormal, I have been
estimating the following simple model:

reg lnoutagecost lnmwhannual

where lnoutagecost and lnmwhannual represent the natural log of the two
variables described above. The results are:

. reg lnoutagecost lnmwhannual

      Source |       SS       df       MS              Number of obs =   32345
-------------+------------------------------           F(  1, 32343) = 9370.20
       Model |  34151.9301     1  34151.9301           Prob > F      =  0.0000
    Residual |  117881.722 32343   3.6447368           R-squared     =  0.2246
-------------+------------------------------           Adj R-squared =  0.2246
       Total |  152033.652 32344  4.70052104           Root MSE      =  1.9091
------------------------------------------------------------------------------
lnoutagecost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .3824726   .0039512    96.80   0.000     .3747282    .3902171
       _cons |   5.370938   .0232302   231.21   0.000     5.325406     5.41647
------------------------------------------------------------------------------

I then tried the following model in glm which I had expected to produce
identical results:

Generalized linear models                          No. of obs      =     52418
Optimization     : ML                              Residual df     =     52416
                                                   Scale parameter =  7.59e+09
Deviance         =  3.97873e+14                    (1/df) Deviance =  7.59e+09
Pearson          =  3.97873e+14                    (1/df) Pearson  =  7.59e+09
Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = ln(u)                    [Log]
                                                   AIC             =   25.5881
Log likelihood   = -670636.5416                    BIC             =  3.98e+14
------------------------------------------------------------------------------
             |                 OIM
  outagecost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .5568004   .0130092    42.80   0.000     .5313029    .5822979
       _cons |   5.355758   .1384432    38.69   0.000     5.084414    5.627102
------------------------------------------------------------------------------
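The glm command itself does not appear in the post. Judging from the output header (Gaussian variance function, log link, dependent variable outagecost), it was presumably close to the call below; that call is an inference, not a quotation. The two outputs also report different sample sizes (32,345 for reg, 52,418 for glm). This may be because customers with a zero outage cost have a missing log and drop out of reg but stay in glm. The sketch below, using the variable names from the post, shows the two fits side by side with a quick count of usable rows.

```stata
* Assumption: the glm call is inferred from the output header above,
* not copied from the original message.

* Log transform: models the mean of ln(outagecost); rows where the
* logged variable is missing are dropped from the estimation sample.
regress lnoutagecost lnmwhannual

* Log link: models ln of the mean of outagecost on the raw scale.
glm outagecost lnmwhannual, family(gaussian) link(log)

* Compare how many observations each specification can use.
count if !missing(lnoutagecost, lnmwhannual)
count if !missing(outagecost, lnmwhannual)
```

If the two counts differ, the models are being fit to different samples as well as to different error structures, so identical coefficients should not be expected.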

Obviously the results are very similar, but not identical.

I read the Stata Manual section on GLM and checked a large number of
posts on Statalist related to loglinear models, but I was not able to
understand exactly why glm using link(log) doesn't produce the same
results as logging both variables and using reg. Based on my reading
of the Stata manual it appears to have something to do with the fact that
the link() option relates to the expectation of the dependent variable,
not the dependent variable itself. Can anyone tell me why the results
are different?

Matthew G. Mercurio, Ph.D.
Senior Consultant
Freeman, Sullivan & Co.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
