Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: conditional SE of y|X in glm


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: conditional SE of y|X in glm
Date   Tue, 24 Apr 2012 13:39:55 +0100

I don't understand "replicating" here but do please look in the manual for questions like that. 

Nick 
[email protected] 

Marco Ventura

Thank you Nick,
could you please tell me what Stata does exactly for replicating 
e(dispers) when family is not normal?
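
One way to check this directly is sketched below. It assumes, going by the "(1/df) Deviance" and "(1/df) Pearson" labels in the -glm- header, that e(dispers) is the deviance divided by the residual degrees of freedom and that e(dispers_p) is the Pearson statistic divided by the same quantity. The stored results e(deviance), e(deviance_p) and e(df) are assumed to be available after -glm-; the dataset and model are the ones used elsewhere in this thread.

```stata
* Sketch only: compare the stored dispersions with deviance/df and Pearson/df
* after a non-Gaussian -glm- fit (binomial family, logit link).
use http://www.stata-press.com/data/r10/beetle, clear
glm r ldose, family(binomial n) link(logit)

* Deviance-based dispersion, stored and recomputed
display e(dispers)
display e(deviance)/e(df)

* Pearson-based dispersion, stored and recomputed
display e(dispers_p)
display e(deviance_p)/e(df)
```

If each pair of displayed values agrees, the assumption holds for this family and link; if not, the manual entry for -glm- is the place to look.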

On 24/04/2012 13:29, Nick Cox wrote:
> With this model, as with every other, you have to decide what you mean by "prediction", i.e. on what scale you are predicting.
>
> Also, I did write
>
> "I like to have such measures accessible for comparing -glm- results  with those of other models in which rmse appears naturally."
>
> and I think logit models are stretching the point.
>
> In essence, what -glmcorr- does in your example is either wrong or irrelevant, depending on your point of view. -glmcorr- can be reconciled with those results by doing instead
>
> . gen fraction = r/n
>
> . glm fraction ldose , link(logit)
>
> Iteration 0:   log likelihood =   3.345982
> Iteration 1:   log likelihood =  3.7166249
> Iteration 2:   log likelihood =  3.7245648
> Iteration 3:   log likelihood =   3.724566
> Iteration 4:   log likelihood =   3.724566
>
> Generalized linear models                          No. of obs      =        24
> Optimization     : ML                              Residual df     =        22
>                                                     Scale parameter =  .0468293
> Deviance         =  1.030244611                    (1/df) Deviance =  .0468293
> Pearson          =  1.030244611                    (1/df) Pearson  =  .0468293
>
> Variance function: V(u) = 1                        [Gaussian]
> Link function    : g(u) = ln(u/(1-u))              [Logit]
>
>                                                     AIC             = -.1437138
> Log likelihood   =  3.724566043                    BIC             = -68.88694
>
> ------------------------------------------------------------------------------
>               |                 OIM
>      fraction |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
>         ldose |   22.43087   5.627079     3.99   0.000       11.402    33.45974
>         _cons |  -40.34087   10.10823    -3.99   0.000    -60.15264   -20.52909
> ------------------------------------------------------------------------------
>
> . glmcorr
>
>      fraction and predicted
>
>      Correlation          0.800
>      R-squared            0.640
>      Root MSE             0.216
>
> . di sqrt(e(dispers))
> .21640079
>
> However, that would lose some of the information in the data.
>
> Otherwise, -glmcorr- uses what -predict- produces by default; if that's wrong for your problem, so will the results be.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Marco Ventura
> Sent: 24 April 2012 10:29
> To: [email protected]
> Subject: Re: st: conditional SE of y|X in glm
>
> Dear Nick,
> thank you very much for your quick replies.
> Unfortunately there is something I still do not understand. If I do
> use http://www.stata-press.com/data/r10/beetle
> glm r ldose, fam(bin n) link (logit)
> di sqrt(e(dispers))
> glmcorr
> I get two very different values 4.065 against 13.179. Which of the two
> is correct?
>
> Thank you again.
> Marco
>
> On 24/04/2012 10:57, Nick Cox wrote:
>> See -glmcorr- (SSC) for one approach here. That calculates an rmse
>> which appears similar, if not identical, to what you want. I like to
>> have such measures accessible for comparing -glm- results  with those
>> of other models in which rmse appears naturally. Perhaps it is a
>> comfort blanket, but there you go.
>>
>> Note that putting a constant into a variable is usually overkill as
>>
>> di sqrt(e(dispers))
>>
>> does the calculation. Use a scalar or local macro if you want to store
>> the value.
>>
>> On Tue, Apr 24, 2012 at 9:31 AM, Marco Ventura <[email protected]> wrote:
>>
>>> from a GLM estimate I want to retrieve the conditional standard error of y
>>> given the covariates. If I do
>>>
>>> gen sigma=sqrt(e(dispers))
>>>
>>> do I always get the right thing independently of any family and link?
>>> Should I correct it by sqrt(e(dispers)* (_N-1)/_N)?
>>> And do you think I should instead use the Pearson residuals such as
>>>
>>> gen sigma=sqrt(e(dispers_p))
>>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


