Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st:glm with bin family and link probit VS. probit

From: Richard Williams
To: statalist@hsphsun2.harvard.edu
Subject: Re: st:glm with bin family and link probit VS. probit
Date: Tue, 31 May 2011 00:43:54 -0500

My guess is that -glm- lacks the specialized code that -probit- uses to detect variables that "predict failure perfectly". So, instead of those variables and their observations getting dropped, m15 and f15 stay in the model, but their estimated standard errors are really large (about 8, versus roughly 0.12 to 0.25 for the other dummies). I would trust the results from -probit- more than the results from -glm- (why do you need both anyway?). You could try -exlogistic-, but your sample may be too big for that to be feasible.
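If both sets of estimates really are needed, one possible way to put them on the same footing is to let -probit- decide what gets dropped and then refit -glm- on that estimation sample without the two problem dummies. This is only a minimal sketch: it assumes the variable names shown in the output below, that f65 serves as the reference category (both commands omit it anyway), and that nothing else in the data needs to change.

```stata
* Fit -probit- first; it drops m15 and f15 and excludes the
* observations that those dummies predict perfectly.
probit AB m0-f65 [fw=ABfreq]

* Refit -glm- on probit's estimation sample, e(sample), listing the
* dummies explicitly so that m15 and f15 (and the reference category
* f65) are left out.
glm AB m0 m25 m40 m65 f0 f25 f40 if e(sample) [fw=ABfreq], family(binomial) link(probit)
```

With the sample and the regressors matched this way, the two commands should report essentially the same coefficients, so any marginal effects computed afterwards would be based on the same model.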
At 11:13 PM 5/30/2011, Judy You wrote:

```
Dear Stata Experts:

I have a question regarding the comparison of two models: glm with
binomial family and probit link vs. probit.

The data are the numbers of people who died from infectious disease, by age
group, as follows.

agegp        0     1
    0   154573     2
   15    97581     0
   25   159888     9
   40   191122    35
   65    29329    20

I got different results from glm with binomial family and probit link
than from probit. The main difference is that glm dropped only the last
dummy variable, "f65", and kept all the other coefficients, even those with
zero values (e.g., f15 and m15). Probit dropped not only f65 but also
the dummies with zero values, f15 and m15. The different results lead to
different estimates of the marginal effects from the two models.
Any ideas or advice on how to make the two models use the same
independent dummy variables?

Your help will be much appreciated!

. glm  AB   m0- f65  [fw= ABfreq], f(b) l(probit) iterate(10)
note: f65 omitted because of collinearity

Iteration 0:   log likelihood =  -47450.19
Iteration 1:   log likelihood = -825.64878
Iteration 2:   log likelihood = -629.32298
Iteration 3:   log likelihood =  -622.2711
Iteration 4:   log likelihood = -621.33495
Iteration 5:   log likelihood = -621.31077
Iteration 6:   log likelihood = -621.30724
Iteration 7:   log likelihood = -621.30711
Iteration 8:   log likelihood = -621.30711
Iteration 9:   log likelihood = -621.30711
Iteration 10:  log likelihood = -621.30711
convergence not achieved

Generalized linear models                          No. of obs      =   632559
Optimization     : ML                              Residual df     =   632549
                                                   Scale parameter =        1
Deviance         =  1242.614219                    (1/df) Deviance = .0019645
Pearson          =   534978.001                    (1/df) Pearson  = .8457495

Variance function: V(u) = u*(1-u)                  [Bernoulli]
Link function    : g(u) = invnorm(u)               [Probit]

                                                   AIC             =  .001996
Log likelihood   = -621.3071096                    BIC             = -8448049

                       OIM
   AB       Coef.   Std. Err.       z   P>|z|     [95% Conf. Interval]
   m0   -.9250873   .2495697    -3.71   0.000    -1.414235   -.4359398
  m15   -2.901722   7.830577    -0.37   0.711    -18.24937    12.44593
  m25   -.5044691    .146919    -3.43   0.001     -.792425   -.2165132
  m40   -.2180508   .1199777    -1.82   0.069    -.4532027    .0171012
  m65    .1468668    .133816     1.10   0.272    -.1154077    .4091414
   f0   -.9122453   .2501379    -3.65   0.000    -1.402507   -.4219841
  f15   -2.901722   8.073848    -0.36   0.719    -18.72617    12.92273
  f25   -.6696904   .1741904    -3.84   0.000    -1.011097   -.3282836
  f40   -.3572294   .1297122    -2.75   0.006    -.6114606   -.1029981
  f65   (omitted)
_cons   -3.288246   .1063749   -30.91   0.000    -3.496737   -3.079755

. probit  AB   m0- f65  [fw= ABfreq], iterate(10)
note: m15 != 0 predicts failure perfectly
      m15 dropped and 190 obs not used
note: f15 != 0 predicts failure perfectly
      f15 dropped and 190 obs not used
note: f65 omitted because of collinearity

Iteration 0:   log likelihood = -660.01746
Iteration 1:   log likelihood =  -629.8237
Iteration 2:   log likelihood =  -621.8218
Iteration 3:   log likelihood = -621.31032
Iteration 4:   log likelihood = -621.30614
Iteration 5:   log likelihood = -621.30613

Probit regression                                 Number of obs   =     534978
                                                  LR chi2(7)      =      77.42
                                                  Prob > chi2     =     0.0000
Log likelihood = -621.30613                       Pseudo R2       =     0.0587

   AB       Coef.   Std. Err.       z   P>|z|     [95% Conf. Interval]
   m0   -.9250871   .2495696    -3.71   0.000    -1.414235   -.4359397
  m15   (omitted)
  m25   -.5044691    .146919    -3.43   0.001     -.792425   -.2165132
  m40   -.2180508   .1199777    -1.82   0.069    -.4532027    .0171012
  m65    .1468668    .133816     1.10   0.272    -.1154077    .4091414
   f0   -.9122452   .2501378    -3.65   0.000    -1.402506   -.4219841
  f15   (omitted)
  f25   -.6696904   .1741904    -3.84   0.000    -1.011097   -.3282836
  f40   -.3572294   .1297122    -2.75   0.006    -.6114606   -.1029981
  f65   (omitted)
_cons   -3.288246   .1063749   -30.91   0.000    -3.496737   -3.079755

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```
```
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

```