



Re: st: RE: Choosing a family using glm


From   Laurie Molina <molinalaurie@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Choosing a family using glm
Date   Tue, 24 Aug 2010 12:38:46 -0500

Thank you very much, Nick.
About 4: the thing is that I am running these regressions for predictive
purposes, so I think I have to choose one model to use for the
rest of my work...
And regarding my last question, I am probably not entirely
understanding the glm: doesn't it assume that the dependent variable
is generated by the process defined by the family, while the CRM (classical
regression model) makes no assumption about the data generating process, only
about the distribution of the error terms? I hope you can help me
clarify this. Thank you!




On Tue, Aug 24, 2010 at 12:28 PM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I don't think you should think in terms of a single test. That would be as naïve for this problem as for many others.
>
> 0. Sometimes one or other approach just won't converge, so that's a sign. Conversely, there are datasets in which model fits are very similar.
>
> 1. Look at the -glm- output, including the z's, the p's and the log likelihood.
>
> 2. Plot residuals vs fitted and observed vs fitted for each family. Plot fitted for normal versus fitted for gamma and see how much difference family choice makes.
>
> 3. Examine whether predictions make scientific sense for cases of most interest.
>
> 4. Why feel obliged to choose one model? Perhaps two models together tell you something.
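
Points 1 and 2 above could be sketched in Stata roughly as follows. This is only a minimal sketch under assumptions: the variable names rent, area and rooms are hypothetical and not from the thread, and deviance residuals are just one of several residual types that -predict- offers after -glm-.

```stata
* Hypothetical variables: rent (response), area and rooms (predictors)

* Normal (Gaussian) family with log link; inspect z, p and log likelihood
glm rent area rooms, family(gaussian) link(log)
predict double mu_normal, mu
predict double dres_normal, deviance

* Gamma family with log link
glm rent area rooms, family(gamma) link(log)
predict double mu_gamma, mu
predict double dres_gamma, deviance

* Residuals versus fitted, one graph per family
scatter dres_normal mu_normal, name(res_normal, replace)
scatter dres_gamma mu_gamma, name(res_gamma, replace)

* Observed versus fitted, one graph per family
scatter rent mu_normal, name(obs_normal, replace)
scatter rent mu_gamma, name(obs_gamma, replace)

* Fitted (normal) versus fitted (gamma), with the line of equality
scatter mu_normal mu_gamma || function y = x, range(mu_gamma)
```

If the points in the last graph hug the line of equality, the choice of family makes little difference to the predictions.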
>
> Your last question appears to be based on a confusion. -glm- is one kind of generalisation of linear regression. Like regression, the central focus is the distribution of the response variable conditional on the predictors, not the marginal distribution of the response. Also, using link functions removes much of the adhockery necessitated by transformations. So, the short answer is emphatically No.
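
The point about conditional distributions can be written out explicitly. The notation below (x_i, \beta, \varepsilon_i, \sigma^2, g, F, \phi) is an assumption introduced here for illustration and does not appear in the thread. In classical linear regression, the normality assumption on the error term is itself an assumption on the response conditional on the predictors:

```latex
% Classical linear regression: normal errors imply a normal conditional response
\[
y_i = x_i^\top \beta + \varepsilon_i, \quad
\varepsilon_i \mid x_i \sim N(0, \sigma^2)
\;\iff\;
y_i \mid x_i \sim N(x_i^\top \beta, \sigma^2)
\]

% GLM: the family F describes the same conditional distribution,
% with the mean tied to the predictors through the link g
\[
g\bigl(E[y_i \mid x_i]\bigr) = x_i^\top \beta, \qquad
y_i \mid x_i \sim F(\mu_i, \phi), \quad \mu_i = E[y_i \mid x_i]
\]
```

With the identity link and the normal family, the second statement reduces to the first, so in neither case is anything assumed about the marginal distribution of the response.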
>
> Nick
> n.j.cox@durham.ac.uk
>
> Laurie Molina
>
> I'm trying to fit a glm to get non-negative fitted values.
> I am thinking of using a glm with a log link.
> But I am not sure about which family to use.
>
> Is there any test I can perform to choose between the normal and gamma
> distributions?
>
> My data are rent prices of houses, so they are not count data and
> therefore I think I should not use poisson.
>
> And just one more question:
>
> To my understanding, in a classical linear regression the assumption of
> normality is on the distribution of the error term, but in glm the
> assumption defined by the family selection is on the distribution of
> the dependent variable. Isn't that a huge cost for using glm instead of
> a classical linear regression model?
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


