Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st:glm with bin family and link probit VS. probit

From	Judy You <[email protected]>
To	[email protected]
Subject	Re: st:glm with bin family and link probit VS. probit
Date	Wed, 1 Jun 2011 09:23:13 +0930

Dear Richard:

Thank you for replying to my question and for your advice to trust
-probit- over -glm- in this case.

I ran both models to check whether they give the same answer, as
most reference books say they should. Now we can see that they
differ, and the difference is even bigger when marginal effects are
estimated after fitting the models. Could any other experts say
whether the -glm- program could be improved?
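
One way to put the two models on the same footing, as a sketch only
(not run on these data): leave m15 and f15 out of -glm- and exclude the
observations that -probit- drops, then compare marginal effects. This
assumes that m15 and f15 are 0/1 dummies, that the m0-f65 varlist
expands to m0 m15 m25 m40 m65 f0 f15 f25 f40 f65, and that -margins-
(Stata 11 or later) is the tool used for marginal effects. f65 is left
out as the base category.

```stata
* Sketch: fit -glm- on the same sample and regressors that -probit- kept.
* Assumes m15 and f15 are 0/1 dummies; f65 is the omitted base category.
glm AB m0 m25 m40 m65 f0 f25 f40 if m15==0 & f15==0 [fw=ABfreq], f(b) l(probit)
margins, dydx(*)

* Same specification in -probit-, for a like-for-like comparison.
probit AB m0 m25 m40 m65 f0 f25 f40 if m15==0 & f15==0 [fw=ABfreq]
margins, dydx(*)
```

With the same variables and the same estimation sample, I would expect
the two sets of marginal effects to agree up to the convergence
tolerance of each command.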

Cheers

2011/5/31 Richard Williams <[email protected]>:
> My guess is that -glm- lacks the specialized code that detects the "predicts
> failure perfectly" variables. So, instead of vars and observations getting
> dropped, f15 and m15 stay in but their estimated standard errors are really
> large. I would trust the results from -probit- more than the results from
> -glm- (why do you need both anyway?). You could try -exlogistic- but your
> sample may be too big for that to be feasible.
>
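
A sketch for checking this guess from the -probit- side (not run on
these data): -probit- has an asis option that skips the perfect
prediction check and retains such variables and their observations.
With it, I would expect m15 and f15 to stay in the model, as they do in
-glm-, with large negative coefficients and very large standard errors,
because the likelihood keeps increasing as those coefficients move
toward minus infinity.

```stata
* Sketch: keep the perfect predictors in -probit-, as -glm- does.
* asis retains perfectly predicting variables and their observations;
* the m15 and f15 estimates are then not meaningful and the
* maximization may be unstable or fail to converge.
probit AB m0-f65 [fw=ABfreq], asis iterate(10)
```
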
> At 11:13 PM 5/30/2011, Judy You wrote:
>>
>> Dear Stata Experts:
>>
>> I have a question about comparing two models: glm with binomial
>> family and probit link vs. probit.
>>
>> The data are the numbers of people who died from infectious disease,
>> by age group, as follows.
>>
>> agegp          0     1
>>     0     154573     2
>>    15      97581     0
>>    25     159888     9
>>    40     191122    35
>>    65      29329    20
>>
>> I got different results from glm with binomial family and probit
>> link than from probit. The main difference is that glm dropped only
>> the last dummy variable, "f65", and kept coefficients even for the
>> dummies with zero counts (f15 and m15). probit dropped not only f65
>> but also the zero-count dummies f15 and m15. The different results
>> lead to different estimated marginal effects after the two models.
>> Any ideas or advice on how to make the two models use the same
>> independent dummy variables?
>>
>> Your help will be much appreciated!
>>
>> . glm AB m0-f65 [fw=ABfreq], f(b) l(probit) iterate(10)
>> note: f65 omitted because of collinearity
>>
>> Iteration 0:   log likelihood =  -47450.19
>> Iteration 1:   log likelihood = -825.64878
>> Iteration 2:   log likelihood = -629.32298
>> Iteration 3:   log likelihood =  -622.2711
>> Iteration 4:   log likelihood = -621.33495
>> Iteration 5:   log likelihood = -621.31077
>> Iteration 6:   log likelihood = -621.30724
>> Iteration 7:   log likelihood = -621.30711
>> Iteration 8:   log likelihood = -621.30711
>> Iteration 9:   log likelihood = -621.30711
>> Iteration 10:  log likelihood = -621.30711
>> convergence not achieved
>>
>> Generalized linear models                          No. of obs      =    632559
>> Optimization     : ML                              Residual df     =    632549
>>                                                    Scale parameter =         1
>> Deviance         =  1242.614219                    (1/df) Deviance =  .0019645
>> Pearson          =   534978.001                    (1/df) Pearson  =  .8457495
>>
>> Variance function: V(u) = u*(1-u)                  [Bernoulli]
>> Link function    : g(u) = invnorm(u)               [Probit]
>>
>>                                                    AIC             =   .001996
>> Log likelihood   = -621.3071096                    BIC             =  -8448049
>>
>>                          OIM
>>       AB       Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
>>
>>       m0   -.9250873   .2495697    -3.71   0.000    -1.414235   -.4359398
>>      m15   -2.901722   7.830577    -0.37   0.711    -18.24937    12.44593
>>      m25   -.5044691    .146919    -3.43   0.001     -.792425   -.2165132
>>      m40   -.2180508   .1199777    -1.82   0.069    -.4532027    .0171012
>>      m65    .1468668    .133816     1.10   0.272    -.1154077    .4091414
>>       f0   -.9122453   .2501379    -3.65   0.000    -1.402507   -.4219841
>>      f15   -2.901722   8.073848    -0.36   0.719    -18.72617    12.92273
>>      f25   -.6696904   .1741904    -3.84   0.000    -1.011097   -.3282836
>>      f40   -.3572294   .1297122    -2.75   0.006    -.6114606   -.1029981
>>      f65   (omitted)
>>    _cons   -3.288246   .1063749   -30.91   0.000    -3.496737   -3.079755
>>
>> . probit AB m0-f65 [fw=ABfreq], iterate(10)
>>
>> note: m15 != 0 predicts failure perfectly
>>       m15 dropped and 190 obs not used
>>
>> note: f15 != 0 predicts failure perfectly
>>       f15 dropped and 190 obs not used
>>
>> note: f65 omitted because of collinearity
>> Iteration 0:   log likelihood = -660.01746
>> Iteration 1:   log likelihood =  -629.8237
>> Iteration 2:   log likelihood =  -621.8218
>> Iteration 3:   log likelihood = -621.31032
>> Iteration 4:   log likelihood = -621.30614
>> Iteration 5:   log likelihood = -621.30613
>>
>> Probit regression                                 Number of obs   =     534978
>>                                                   LR chi2(7)      =      77.42
>>                                                   Prob > chi2     =     0.0000
>> Log likelihood = -621.30613                       Pseudo R2       =     0.0587
>>
>>       AB       Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
>>
>>       m0   -.9250871   .2495696    -3.71   0.000    -1.414235   -.4359397
>>      m15   (omitted)
>>      m25   -.5044691    .146919    -3.43   0.001     -.792425   -.2165132
>>      m40   -.2180508   .1199777    -1.82   0.069    -.4532027    .0171012
>>      m65    .1468668    .133816     1.10   0.272    -.1154077    .4091414
>>       f0   -.9122452   .2501378    -3.65   0.000    -1.402506   -.4219841
>>      f15   (omitted)
>>      f25   -.6696904   .1741904    -3.84   0.000    -1.011097   -.3282836
>>      f40   -.3572294   .1297122    -2.75   0.006    -.6114606   -.1029981
>>      f65   (omitted)
>>    _cons   -3.288246   .1063749   -30.91   0.000    -3.496737   -3.079755
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME:   (574)289-5227
> EMAIL:  [email protected]
> WWW:    http://www.nd.edu/~rwilliam
>
