



Re: st:glm with bin family and link probit VS. probit


From   Judy You <joodyu@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st:glm with bin family and link probit VS. probit
Date   Wed, 1 Jun 2011 09:23:13 +0930

Dear Richard:

Thank you for replying to my question, and for explaining why you would
trust probit over glm in this case.

I ran the two models to check whether they give the same answer, as
most reference books say they should. Now we can see that they differ.
The difference is even bigger when marginal effects are estimated
after fitting the models. Could any other experts explain whether the
glm program could be improved?
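
One way to make the two commands use the same dummies and observations might be to fit -probit- first and then refit -glm- on probit's estimation sample. A minimal sketch, assuming the variable names from the original post (AB, m0-f65, ABfreq) and Stata 11 or later for -margins-; the marker variable insample is a hypothetical name introduced here:

```stata
* Fit probit first: it drops m15 and f15 together with the
* observations that those dummies perfectly predict
probit AB m0-f65 [fw=ABfreq]

* Mark probit's estimation sample (insample is a hypothetical name)
generate byte insample = e(sample)
margins, dydx(*)

* Refit glm on the same sample; m15 and f15 are now all zero there,
* so glm should omit them as collinear, as probit did
glm AB m0-f65 [fw=ABfreq] if insample, family(binomial) link(probit)
margins, dydx(*)
```

Going the other way, -probit- has an -asis- option that retains perfect predictors, which should make it behave more like -glm- here, although the coefficients on m15 and f15 would still not be meaningfully estimated.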

Cheers

2011/5/31 Richard Williams <richardwilliams.ndu@gmail.com>:
> My guess is that -glm- lacks the specialized code that detects the "predicts
> failure perfectly" variables. So, instead of vars and observations getting
> dropped, f15 and m15 stay in but their estimated standard errors are really
> large. I would trust the results from -probit- more than the results from
> -glm- (why do you need both anyway?). You could try -exlogistic- but your
> sample may be too big for that to be feasible.
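
The large standard errors can be seen from the likelihood itself. As a sketch, assume a group of n observations with no deaths, a dummy coefficient \beta for that group, and a constant \alpha (these symbols are introduced here for illustration only). The group's contribution to the probit log likelihood, and its derivative, are:

```latex
\ell(\beta) = n \log\bigl(1 - \Phi(\alpha + \beta)\bigr),
\qquad
\frac{\partial \ell}{\partial \beta}
  = -\,n\,\frac{\phi(\alpha + \beta)}{1 - \Phi(\alpha + \beta)} < 0 .
```

Because the derivative is negative for every finite \beta, the log likelihood keeps rising toward 0 as \beta goes to minus infinity, and no finite maximum exists. The iterations end on a nearly flat stretch of the likelihood, which is consistent with the very large standard errors that -glm- reports for m15 and f15 and with its "convergence not achieved" message.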
>
> At 11:13 PM 5/30/2011, Judy You wrote:
>>
>> Dear Stata Experts:
>>
>>
>>
>> I have a question regarding a comparison of two models: glm with
>> bin family and link probit vs. probit.
>>
>> The data are the numbers of people who died from infectious disease,
>> by age group, as follows.
>>
>>
>>
>> agegp          0        1
>>
>> 0         154573        2
>> 15         97581        0
>> 25        159888        9
>> 40        191122       35
>> 65         29329       20
>>
>>
>>
>> I get different results from glm with bin family and link probit
>> than from probit. The main difference is that glm dropped only the
>> last dummy variable, "f65", while keeping all the other coefficients,
>> even those with zero values (e.g., f15 and m15). Probit dropped not
>> only f65 but also the dummies with zero values, f15 and m15. The
>> different results lead to different estimates of the marginal effects
>> after the two models. Any ideas or advice on how to make the two
>> models use the same independent dummy variables?
>>
>>
>>
>> Your help will be much appreciated!
>>
>>
>>
>> . glm  AB   m0- f65  [fw= ABfreq], f(b) l(probit) iterate(10)
>> note: f65 omitted because of collinearity
>>
>> Iteration 0:   log likelihood =  -47450.19
>> Iteration 1:   log likelihood = -825.64878
>> Iteration 2:   log likelihood = -629.32298
>> Iteration 3:   log likelihood =  -622.2711
>> Iteration 4:   log likelihood = -621.33495
>> Iteration 5:   log likelihood = -621.31077
>> Iteration 6:   log likelihood = -621.30724
>> Iteration 7:   log likelihood = -621.30711
>> Iteration 8:   log likelihood = -621.30711
>> Iteration 9:   log likelihood = -621.30711
>> Iteration 10:  log likelihood = -621.30711
>> convergence not achieved
>>
>> Generalized linear models                          No. of obs      =    632559
>> Optimization     : ML                              Residual df     =    632549
>>                                                    Scale parameter =         1
>> Deviance         =  1242.614219                    (1/df) Deviance =  .0019645
>> Pearson          =   534978.001                    (1/df) Pearson  =  .8457495
>>
>> Variance function: V(u) = u*(1-u)                  [Bernoulli]
>> Link function    : g(u) = invnorm(u)               [Probit]
>>
>>                                                    AIC             =   .001996
>> Log likelihood   = -621.3071096                    BIC             =  -8448049
>>
>>                           OIM
>>    AB       Coef.   Std. Err.        z   P>|z|    [95% Conf. Interval]
>>
>>    m0   -.9250873    .2495697    -3.71   0.000   -1.414235   -.4359398
>>   m15   -2.901722    7.830577    -0.37   0.711   -18.24937    12.44593
>>   m25   -.5044691     .146919    -3.43   0.001    -.792425   -.2165132
>>   m40   -.2180508    .1199777    -1.82   0.069   -.4532027    .0171012
>>   m65    .1468668     .133816     1.10   0.272   -.1154077    .4091414
>>    f0   -.9122453    .2501379    -3.65   0.000   -1.402507   -.4219841
>>   f15   -2.901722    8.073848    -0.36   0.719   -18.72617    12.92273
>>   f25   -.6696904    .1741904    -3.84   0.000   -1.011097   -.3282836
>>   f40   -.3572294    .1297122    -2.75   0.006   -.6114606   -.1029981
>>   f65   (omitted)
>> _cons   -3.288246    .1063749   -30.91   0.000   -3.496737   -3.079755
>>
>>
>>
>> . probit  AB   m0- f65  [fw= ABfreq], iterate(10)
>>
>> note: m15 != 0 predicts failure perfectly
>>       m15 dropped and 190 obs not used
>>
>> note: f15 != 0 predicts failure perfectly
>>       f15 dropped and 190 obs not used
>>
>> note: f65 omitted because of collinearity
>> Iteration 0:   log likelihood = -660.01746
>> Iteration 1:   log likelihood =  -629.8237
>> Iteration 2:   log likelihood =  -621.8218
>> Iteration 3:   log likelihood = -621.31032
>> Iteration 4:   log likelihood = -621.30614
>> Iteration 5:   log likelihood = -621.30613
>>
>> Probit regression                                 Number of obs   =     534978
>>                                                   LR chi2(7)      =      77.42
>>                                                   Prob > chi2     =     0.0000
>> Log likelihood = -621.30613                       Pseudo R2       =     0.0587
>>
>>    AB       Coef.   Std. Err.        z   P>|z|    [95% Conf. Interval]
>>
>>    m0   -.9250871    .2495696    -3.71   0.000   -1.414235   -.4359397
>>   m15   (omitted)
>>   m25   -.5044691     .146919    -3.43   0.001    -.792425   -.2165132
>>   m40   -.2180508    .1199777    -1.82   0.069   -.4532027    .0171012
>>   m65    .1468668     .133816     1.10   0.272   -.1154077    .4091414
>>    f0   -.9122452    .2501378    -3.65   0.000   -1.402506   -.4219841
>>   f15   (omitted)
>>   f25   -.6696904    .1741904    -3.84   0.000   -1.011097   -.3282836
>>   f40   -.3572294    .1297122    -2.75   0.006   -.6114606   -.1029981
>>   f65   (omitted)
>> _cons   -3.288246    .1063749   -30.91   0.000   -3.496737   -3.079755
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME:   (574)289-5227
> EMAIL:  Richard.A.Williams.5@ND.Edu
> WWW:    http://www.nd.edu/~rwilliam

