.-
help for ^glm^                                                 (manual:  ^[R] glm^)
.-

Generalized linear models
-------------------------

	^glm^ depvar [varlist] [weight] [^if^ exp] [^in^ range] [^,^ ^f^amily^(^familyname^)^
		^l^ink^(^linkname^)^ ^nocons^tant ^s^cale^(x2^|^dev^|#^)^ [^ln^]^o^ffset^(^varname^)^
		^disp(^#^)^ ^ef^orm ^le^vel^(^#^)^ ^it^erate^(^#^)^ ^lt^ol^(^#^)^ ^ini^t^(^varname^)^ ^nolo^g ]

where familyname is one of

	^gau^ssian |  ^ig^aussian     |  ^b^inomial [varname|#]  |
	^p^oisson  |  ^nb^inomial [#] |  ^gam^ma

and linkname is one of

	^i^dentity |  ^log^     |  ^l^ogit    |  ^p^robit  |  ^c^loglog  |
	^opo^wer # |  ^pow^er # |  ^nb^inomial | ^h^t varname


^aweight^s, ^fweight^s, and ^iweight^s are allowed; see help @weights@.

^glm^ shares the features of all estimation commands; see help @est@.

^glm^ may be used with ^sw^ to perform stepwise estimation; see help @sw@.


The syntax of @predict@ after ^glm^ is

	^predict^ [type] newvarname [^if^ exp] [^in^ range] [^,^ statistic ^nooff^set]

where statistic is

	^m^u            predicted mean of y = g_inverse(xb); the default
	^xb^            linear prediction
	^stdp^          standard error of the linear prediction
	^d^eviance      deviance residual
	^p^earson       Pearson residual

These statistics are available both in and out of sample; type "^predict^ ...
^if e(sample)^ ..." to restrict them to the estimation sample.


Description
-----------

^glm^ fits generalized linear models.


Options
-------

^family(^familyname^)^ specifies the distribution of depvar; ^family(gaussian)^ is
    the default.

^link(^linkname^)^ specifies the link function; the default is the canonical link
    for the ^family()^ specified, except that ^family(nbinomial)^ defaults to
    ^link(log)^.

^noconstant^ specifies that the linear predictor has no intercept term, thus
    forcing it through the origin on the scale defined by the link function.

^scale(x2^|^dev^|#^)^ overrides the default scale parameter.  By default, ^scale(1)^ is
    assumed for discrete distributions (binomial, Poisson, negative binomial)
    and ^scale(x2)^ for continuous distributions (Gaussian, gamma, inverse
    Gaussian).

    ^scale(x2)^ specifies the scale parameter be set to the Pearson chi-squared
    (or generalized chi-squared) statistic divided by the residual degrees of
    freedom.

    ^scale(dev)^ sets the scale parameter to the deviance divided by the residual
    degrees of freedom.  This provides an alternative to ^scale(x2)^ for con-
    tinuous distributions and over- or under-dispersed discrete distributions.

    ^scale(^#^)^ sets the scale parameter to #.
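
    As a hedged illustration with hypothetical variables ^deaths^, ^smokes^,
    and ^agecat^, an overdispersed Poisson model might be refit with the
    Pearson-based scale in place of the default ^scale(1)^:

        . ^glm deaths smokes agecat, family(poisson) link(log)^
        . ^glm deaths smokes agecat, family(poisson) link(log) scale(x2)^

    The scale parameter affects the reported standard errors, not the
    coefficient estimates.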

[^ln^]^offset(^varname^)^ specifies an offset to be added to the linear predictor.
    ^offset()^ specifies the values directly:     g(E(y)) = xB + varname.
    ^lnoffset()^ specifies exponentiated values:  g(E(y)) = xB + ln(varname).
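
    A sketch for count data, assuming hypothetical variables ^deaths^,
    ^smokes^, and ^pyears^ (person-years at risk); by the definitions above,
    these two specifications request the same offset:

        . ^glm deaths smokes, family(poisson) link(log) lnoffset(pyears)^

        . ^generate lnpy = ln(pyears)^
        . ^glm deaths smokes, family(poisson) link(log) offset(lnpy)^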

^disp(^#^)^ multiplies the variance of y by # and divides the deviance by #.  The
    resulting distributions are members of the quasi-likelihood family.

^eform^ displays the exponentiated coefficients and corresponding standard errors
    and confidence intervals.  For binomial models with the logit link, expo-
    nentiation results in odds ratios; for Poisson models with the log link,
    exponentiated coefficients are rate ratios.

^level(^#^)^ specifies the confidence level, in percent, for confidence intervals
    of the coefficients; see help @level@.

^iterate(^#^)^ specifies the maximum number of iterations allowed in estimating the
    model; ^iterate(50)^ is the default.

^ltol(^#^)^ specifies the convergence criterion for the change in deviance between
    iterations; ^ltol(1e-6)^ is the default.

^init(^varname^)^ specifies varname containing an initial estimate for the mean of
    depvar.  This can be useful if you encounter convergence difficulties,
    especially with binomial models with power or odds-power links.
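
    One possible way to build such a variable, sketched with hypothetical
    variables ^y^ and ^x^, is to save the fitted means from a model that does
    converge and pass them as starting values:

        . ^glm y x, family(binomial) link(logit)^
        . ^predict mu0^
        . ^glm y x, family(binomial) link(opower 0.5) init(mu0)^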

^nolog^ suppresses the iteration log.


Options for @predict@
-------------------

^mu^, the default, requests the predicted value of y; y_hat = g_inverse(xb).

^xb^ requests the linear predictor xb.

^stdp^ requests the standard error of the linear predictor.

^deviance^ requests the deviance residuals.

^pearson^ requests Pearson residuals.

^nooffset^ is relevant only if you specified ^offset()^ or ^lnoffset()^ for ^glm^.  It
    modifies the calculations made by ^predict^ so that they ignore the offset
    variable; the linear prediction is treated as x_j*b rather than x_j*b +
    offset_j.
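
    For instance, assuming a Poisson model fitted with hypothetical variables
    ^deaths^, ^smokes^, and ^pyears^ (the exposure), the two predictions
    below would differ only in whether the offset ln(pyears) enters the
    linear prediction:

        . ^glm deaths smokes, family(poisson) link(log) lnoffset(pyears)^
        . ^predict n_hat^
        . ^predict rate_hat, nooffset^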


Remarks
-------

The allowed link functions are

	Link function               ^glm^ option
	------------------------------------------
	identity                    ^link(identity)^
	log                         ^link(log)^
	logit                       ^link(logit)^
	probit                      ^link(probit)^
	complementary log-log       ^link(cloglog)^
	odds power                  ^link(opower^ #^)^
	power                       ^link(power^ #^)^
	negative binomial           ^link(nbinomial)^
	Hakulinen & Tenkanen        ^link(ht^ varname^)^

The allowed distribution families are

	Family                      ^glm^ option
	----------------------------------------------------------------
	Gaussian (normal)           ^family(gaussian)^  or  ^family(normal)^
	Inverse Gaussian            ^family(igaussian)^
	Bernoulli/binomial          ^family(binomial)^
	Poisson                     ^family(poisson)^
	Negative binomial           ^family(nbinomial)^
	Gamma                       ^family(gamma)^


The allowed combinations are

	      | id  log   logit   probit   cloglog   power   opower  nb   ht
--------------+---------------------------------------------------------------
Gaussian      |  x   x                                 x
inv. Gaussian |  x   x                                 x
binomial      |  x   x      x       x         x        x        x          x
Poisson       |  x   x                                 x
neg. binomial |  x   x                                 x              x
gamma         |  x   x                                 x


If you specify ^family()^ but not ^link()^, you obtain the canonical link for
the family (except for ^family(nbinomial)^, which defaults to ^link(log)^):

		^family()^                default ^link()^
		----------------------------------------
		^family(gaussian)^        ^link(identity)^
		^family(igaussian)^       ^link(power -2)^
		^family(binomial)^        ^link(logit)^
		^family(poisson)^         ^link(log)^
		^family(nbinomial)^       ^link(log)^
		^family(gamma)^           ^link(power -1)^


Special comments on ^family(gaussian)^ models
-------------------------------------------

While ^glm^ can be used to fit linear regression (^family(gaussian) link(identity)^
models) and, in fact, does so by default, it is better to use the @regress@ com-
mand because it is quicker and numerous post-estimation commands are available
to explore the adequacy of the fit.


Special comments on ^family(binomial)^ models
-------------------------------------------

The binomial distribution can be specified as

	(1)  ^family(binomial)^

	(2)  ^family(binomial^ #^)^

	(3)  ^family(binomial^ varname^)^

In case 2, # is the value of the binomial denominator N, the number of trials.
Specifying ^family(binomial 1)^ is the same as specifying ^family(binomial)^; both
mean that y has the Bernoulli distribution with values 0 and 1 only.

In case 3, varname is a variable containing the binomial denominator, thus
allowing the number of trials to vary across observations.

For ^family(binomial) link(logit)^ models, we recommend using the @logistic@ com-
mand in preference to ^glm^.  Both produce the same answers, but @logistic@ pro-
vides useful post-estimation commands.

For ^family(binomial) link(probit)^ models, we recommend using the @probit@ command
in preference to ^glm^.  Both produce the same coefficients, but the standard
errors are only asymptotically equivalent because probit is not the canonical
link for the binomial.  The @probit@ command produces full maximum-likelihood
results.

Special comments on ^family(binomial) link(ht varname)^ models
--------------------------------------------------------------

This is the Hakulinen and Tenkanen link function, where varname is a variable
containing expected probabilities.  The link function is then

                      log(-log(p1/p2))

where p1 are the observed probabilities and p2 are the expected probabilities
contained in varname.
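
As a worked illustration of the formula only, with made-up values p1 = .8 and
p2 = .9, the transformed value can be computed with @display@:

 . ^display ln(-ln(.8/.9))^

A model using this link might then be specified as follows, where ^died^ and
^agegrp^ are hypothetical variables and ^p_exp^ is a hypothetical variable
holding the expected probabilities:

 . ^glm died agegrp, family(binomial) link(ht p_exp)^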

Special comments on ^family(nbinomial)^ models
--------------------------------------------

The negative binomial distribution can be specified as

	(1)  ^family(nbinomial)^

	(2)  ^family(nbinomial^ #^)^

^family(nbinomial)^ is equivalent to ^family(nbinomial 1)^.  #, often called k,
enters the variance and deviance functions; typical values range between .01
and 2.

^family(nbinomial) link(log)^ models -- also known as negative binomial regres-
sion -- are used for data with an overdispersed Poisson distribution.  While
^glm^ can be used to estimate such models, use of Stata's maximum-likelihood
@nbreg@ command is probably preferable.  Under the ^glm^ approach, one must search
for the value of k that makes the deviance-based dispersion equal 1.  @nbreg@,
on the other hand, finds the maximum-likelihood estimate of k and reports a
confidence interval for it.
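
A minimal sketch of that search, assuming hypothetical variables ^accidents^
and ^shift^: refit over a few trial values of k, compare the reported
dispersion, and contrast with the @nbreg@ fit:

 . ^glm accidents shift, family(nbinomial .5) link(log)^
 . ^glm accidents shift, family(nbinomial 1) link(log)^
 . ^glm accidents shift, family(nbinomial 2) link(log)^
 . ^nbreg accidents shift^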


Special comment on ^family(gamma) link(log)^ models
-------------------------------------------------

^glm^ can be used to estimate exponential regression, but this requires specify-
ing ^scale(1)^.  It is better to use the @ereg@ command.  ^glm^-reported standard
errors will be only asymptotically equivalent to those reported by @ereg@ because
log is not the canonical link for the gamma family.  In addition, ^glm^ cannot be
used to estimate exponential regressions on censored data.
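
For example, with hypothetical variables ^time^, ^drug^, and ^age^ and no
censored observations, such a model could be requested as

 . ^glm time drug age, family(gamma) link(log) scale(1)^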


Examples
--------

 . ^glm low age lwt race2 race3 smoke ptl ht ui, f(bin) l(logit)^
 . ^glm, eform^

 . ^glm dead ln_dose, family(binomial pop) link(logit)^
 . ^glm dead ln_dose, family(binomial pop) link(cloglog)^
 . ^predict e_deaths^
 . ^summarize dead e_deaths^
 . ^predict rd if e(sample), deviance^

 . ^xi: glm dead i.beetle ln_dose, f(bin pop) link(cl)^
 . ^xi: glm dead i.beetle*ln_dose, f(bin pop) link(cl)^
 . ^testparm I*^


Also see
--------

 Manual:  ^[U] 23 Estimation and post-estimation commands^,
	  ^[U] 29 Overview of model estimation in Stata^,
	  ^[R] glm^
On-line:  help for @est@, @postest@; @cloglog@, @logistic@, @nbreg@, @poisson@, @regdiag@,
		   @regress@, @streg@, @sw@, @weibull@, @xtgee@