RE: st: RE: Choosing a family using glm

From   Nick Cox
Subject   RE: st: RE: Choosing a family using glm
Date   Tue, 24 Aug 2010 19:26:46 +0100

Comments below. 


Laurie Molina

Thank you very much, Nick.
About 4: the thing is that I am running these regressions for predictive
purposes, so I think I have to choose one model to use for the
rest of my work...

NJC >>> I don't see why. Model choice is not monogamy, but it's your decision either way. 

And regarding my last question, probably I am not entirely
understanding the glm: doesn't it assume that the dependent variable
is generated by the process defined by the family, while the CRM
(classical regression model) makes no assumption about the data-generating
process, only about the distribution of the error terms? I hope you can
help me clarify this. Thank you!

NJC >>> I don't want to seem rude, but this looks like the same question all over again, so my answer is the same: No, and for the same reasons. If this is the model you are going to use, you just need to start reading. The manual entry for -glm- is naturally one possibility.
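
The distinction behind that answer can be written out in symbols. The notation here is not from the thread and is only an illustrative sketch: y_i is the response, x_i the predictors, g the link function, V the variance function of the family, and phi a dispersion parameter.

```latex
\begin{align*}
% Classical linear regression: assuming normal errors is the same as
% assuming a normal response conditional on the predictors.
y_i = x_i^\top\beta + \varepsilon_i,\quad \varepsilon_i \sim N(0,\sigma^2)
  \quad&\Longleftrightarrow\quad
  y_i \mid x_i \sim N(x_i^\top\beta,\ \sigma^2) \\[4pt]
% GLM: the family also describes the response conditional on the predictors.
g(\mu_i) = x_i^\top\beta,\qquad \mu_i &= E(y_i \mid x_i),\qquad
  \operatorname{Var}(y_i \mid x_i) = \phi\, V(\mu_i) \\[4pt]
% Log link: fitted means are positive for any value of the linear predictor.
g(\mu) = \log\mu \quad&\Longrightarrow\quad \mu_i = \exp(x_i^\top\beta) > 0 \\[4pt]
% The two families under discussion differ in the variance function.
V(\mu) = 1 \ \text{(normal)},\qquad V(\mu) &= \mu^2 \ \text{(gamma)}
\end{align*}
```

Read this way, the normal-errors assumption of classical regression is already an assumption about the response given the predictors. The family in -glm- plays the same role and says nothing about the marginal distribution of the response.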

On Tue, Aug 24, 2010 at 12:28 PM, Nick Cox wrote:

> I don't think you should think in terms of a single test. That would be as naïve for this problem as for many others.
> 0. Sometimes one or other approach just won't converge, so that's a sign. Conversely, there are datasets in which model fits are very similar.
> 1. Look at the -glm- output, including the z's, the p's and the log likelihood.
> 2. Plot residuals vs fitted and observed vs fitted for each family. Plot fitted for normal versus fitted for gamma and see how much difference family choice makes.
> 3. Examine whether predictions make scientific sense for cases of most interest.
> 4. Why feel obliged to choose one model? Perhaps two models together tell you something.
> Your last question appears to be based on a confusion. -glm- is one kind of generalisation of linear regression. Like regression, the central focus is the distribution of the response variable conditional on the predictors, not the marginal distribution of the response. Also, using link functions removes much of the adhockery necessitated by transformations. So, the short answer is emphatically No.
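
Points 1 and 2 in the list above might be sketched in Stata roughly as follows. The variable names (rent as the response, area and rooms as predictors) are hypothetical and stand in for whatever is in the actual dataset; the stored-estimate and graph names are likewise arbitrary. This is one possible workflow under those assumptions, not a prescribed procedure.

```stata
* Hypothetical variables: rent (response), area and rooms (predictors)

* Normal (Gaussian) family with log link; this fit may fail to converge (point 0)
glm rent area rooms, family(gaussian) link(log)
estimates store m_normal
predict double mu_normal, mu
predict double r_normal, deviance

* Gamma family with log link
glm rent area rooms, family(gamma) link(log)
estimates store m_gamma
predict double mu_gamma, mu
predict double r_gamma, deviance

* Point 1: log likelihood, AIC and BIC side by side
estimates stats m_normal m_gamma

* Point 2: residuals versus fitted, one graph per family
scatter r_normal mu_normal, yline(0) name(res_normal, replace)
scatter r_gamma  mu_gamma,  yline(0) name(res_gamma, replace)

* Observed versus fitted, with a line of equality for reference
scatter rent mu_normal || line mu_normal mu_normal, sort name(obs_normal, replace)
scatter rent mu_gamma  || line mu_gamma  mu_gamma,  sort name(obs_gamma, replace)

* Fitted (gamma) versus fitted (normal): how much does family choice matter?
scatter mu_gamma mu_normal || line mu_normal mu_normal, sort name(fit_vs_fit, replace)
```

If the last graph hugs the line of equality, the two families give nearly the same predictions and the choice matters little for predictive purposes. Visible departures show where to look more closely (point 3).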

Laurie Molina

> I'm trying to fit a glm to get non-negative fitted values.
> I am thinking of using a glm with a log link,
> but I am not sure which family to use.
> Is there any test I can perform to choose between the normal and gamma
> distributions?
> My data is for the rent price of houses, so it is not count data and
> therefore I think I should not use Poisson.
> And just one more question:
> To my understanding, in a classical linear regression the assumption of
> normality is on the distribution of the error term, but in glm the
> assumption defined by the family selection is on the distribution of
> the dependent variable. Isn't that a huge cost of using glm instead of
> a classical linear regression model?
