# st: glm and reg produce different results for loglinear model?

From: "Matthew Mercurio (matthewmercurio)"
To: statalist@hsphsun2.harvard.edu
Subject: st: glm and reg produce different results for loglinear model?
Date: Fri, 31 Oct 2008 15:43:20 -0700

```
I have two variables,

(1) outagecost (estimated costs to each customer of a short electrical
power interruption)
(2) mwhannual (annual megawatt hours of electricity consumption for each
customer)

Since these variables appear approximately lognormal, I have been
estimating the following simple model:

reg lnoutagecost lnmwhannual

where lnoutagecost and lnmwhannual represent the natural log of the two
variables described above.  The results are:

. reg lnoutagecost lnmwhannual

      Source |       SS       df       MS              Number of obs =   32345
-------------+------------------------------           F(  1, 32343) = 9370.20
       Model |  34151.9301     1  34151.9301           Prob > F      =  0.0000
    Residual |  117881.722 32343   3.6447368           R-squared     =  0.2246
-------------+------------------------------           Adj R-squared =  0.2246
       Total |  152033.652 32344  4.70052104           Root MSE      =  1.9091

------------------------------------------------------------------------------
lnoutagecost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .3824726   .0039512    96.80   0.000     .3747282    .3902171
       _cons |   5.370938   .0232302   231.21   0.000     5.325406     5.41647
------------------------------------------------------------------------------

I then tried the following model in glm which I had expected to produce
identical results:

Generalized linear models                          No. of obs      =     52418
Optimization     : ML                              Residual df     =     52416
                                                   Scale parameter =  7.59e+09
Deviance         =  3.97873e+14                    (1/df) Deviance =  7.59e+09
Pearson          =  3.97873e+14                    (1/df) Pearson  =  7.59e+09

Variance function: V(u) = 1                        [Gaussian]
Link function    : g(u) = ln(u)                    [Log]

                                                   AIC             =   25.5881
Log likelihood   = -670636.5416                    BIC             =  3.98e+14

------------------------------------------------------------------------------
             |                 OIM
  outagecost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 lnmwhannual |   .5568004   .0130092    42.80   0.000     .5313029    .5822979
       _cons |   5.355758   .1384432    38.69   0.000     5.084414    5.627102
------------------------------------------------------------------------------

Obviously the results are very similar, but not identical.

I read the Stata Manual section on GLM and checked a large number of
posts on Statalist related to loglinear models, but I was not able to
understand exactly why glm using link(log) doesn't produce the same
results as logging both variables and using reg.   Based on my reading
of the Stata manual it appears to have something to do with the fact that
the link() option relates to the expectation of the dependent variable,
not the dependent variable itself.  Can anyone tell me why the results
are different?

Matthew G. Mercurio, Ph.D.
Senior Consultant
Freeman, Sullivan & Co.

```
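The distinction raised at the end of the post, that link() acts on the expectation of the dependent variable rather than on the variable itself, can be illustrated with a small simulation. Regressing ln(y) on x models E[ln y | x], whereas a Gaussian GLM with a log link models ln E[y | x], and the log of a mean is not the mean of the logs. The sketch below is only an illustration under stated assumptions: it uses simulated lognormal data, not the poster's data, and the function name `jensen_gap` and the parameters `mu`, `sigma`, `n`, and `seed` are hypothetical choices made for this example.

```python
import math
import random


def jensen_gap(mu=5.0, sigma=1.0, n=100_000, seed=0):
    """Return (mean of ln y, ln of mean y) for simulated lognormal y.

    All parameter values are illustrative assumptions, not estimates
    taken from the post.
    """
    rng = random.Random(seed)
    # Draw ln(y) ~ Normal(mu, sigma^2), so y itself is lognormal.
    log_y = [rng.gauss(mu, sigma) for _ in range(n)]
    y = [math.exp(v) for v in log_y]
    # Quantity targeted when the logged variable is modelled directly.
    mean_of_log = math.fsum(log_y) / n
    # Quantity targeted when a log link is applied to the mean of y.
    log_of_mean = math.log(math.fsum(y) / n)
    return mean_of_log, log_of_mean


mean_of_log, log_of_mean = jensen_gap()
print(f"mean of ln(y): {mean_of_log:.3f}")
print(f"ln of mean(y): {log_of_mean:.3f}")
print(f"gap:           {log_of_mean - mean_of_log:.3f}")
```

For lognormal data the gap is sigma^2/2 in theory, so with a constant log-scale variance only the intercept would be expected to shift between the two models. If that variance changes with x, the slopes can differ as well. Separately, the two outputs in the post report different numbers of observations (32345 versus 52418), so the two fits were not run on the same sample, which may also contribute to the difference.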