Re: st: Choosing a family using glm

From	Laurie Molina <[email protected]>
To	[email protected]
Subject	Re: st: Choosing a family using glm
Date	Tue, 24 Aug 2010 17:28:22 -0500

Thank you very much, Phil. I will work on that!


On Tue, Aug 24, 2010 at 4:14 PM, Phil Schumm <[email protected]> wrote:
> On Aug 24, 2010, at 12:10 PM, Laurie Molina wrote:
>>
>> I'm trying to fit a glm to get non-negative fitted values.  I am thinking
>> of using a glm with a log link, but I am not sure which family to use.
>> Is there any test I can perform to choose between the normal and gamma
>> distributions?
>
>
> Everything Nick said is correct, of course -- I'll just expand a bit.  WRT
> the distributional family, what is most important is that the variance
> function of the family (i.e., the way in which the variance changes WRT the
> mean) is consistent with your data.  For example, the variance function for
> the Normal distribution is V(mu) = 1 (where mu is E(Y) or the mean of Y),
> which corresponds to constant variance (this is why you look for
> homoscedasticity in residual plots after classical linear regression).  In
> contrast, the variance function for the gamma distribution is V(mu) = mu^2,
> which means that the variance increases with the square of the mean (i.e.,
> constant coefficient of variation).  The easiest (and in any case
> indispensable) way to check if your variance function is plausible is to
> plot the standardized residuals versus the fitted values and verify that the
> amount of variation appears constant; in some cases it might be helpful to
> examine a plot of the absolute residuals versus the fitted values, together
> with the aid of -lowess-.
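
A minimal sketch of the residual checks described above. It assumes the auto dataset shipped with Stata as stand-in data (not the rent data from this thread), and the new variable names muhat, rstd, and absr are hypothetical. Substitute your own outcome and covariates.

```stata
* Stand-in data (assumption): the auto dataset that ships with Stata
sysuse auto, clear

* Gamma family with log link: positive fitted values, V(mu) = mu^2
glm price weight mpg, family(gamma) link(log)

* Fitted means and standardized Pearson residuals
predict double muhat, mu
predict double rstd, pearson standardized

* Standardized residuals vs. fitted values: the vertical spread
* should look roughly constant if the variance function is plausible
scatter rstd muhat, yline(0)

* Absolute residuals vs. fitted values with a lowess smooth:
* a clear trend suggests the variance function does not fit
generate double absr = abs(rstd)
lowess absr muhat
```

Rerunning the same steps with family(gaussian) link(log) and comparing the two sets of plots is one way to judge which variance function is more plausible for the data.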
>
>
>> My data are rent prices of houses, so they are not count data, and
>> therefore I think I should not use Poisson.
>
>
> Again, what's important is that you select a family whose variance function
> is consistent with your data.  For more information, see the book
> Generalized Linear Models by McCullagh and Nelder.
>
>
>> As I understand it, in a classical linear regression the assumption of
>> normality is on the distribution of the error term, but in glm the assumption
>> defined by the family selection is on the distribution of the dependent
>> variable. Isn't that a huge cost of using glm instead of a classical linear
>> regression model?
>
>
> You are laboring under a misunderstanding.  To say that the distribution of
> Y conditional on X is Normal with mean XB and variance sigma^2 is the same
> as saying that the distribution of the errors (i.e., Y - XB) is Normal with
> mean 0 and variance sigma^2.  And to emphasize the GLM approach, what is
> most important (if you're fitting a linear regression) is that the mean is
> XB and the variance is constant (i.e., that your assumptions about the first
> and second moments are correct).
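
One way to see this in Stata is to fit the same mean model with -regress- and with -glm- using the Gaussian family and identity link. This is only a sketch: it assumes the auto dataset shipped with Stata as stand-in data, and the stored-estimate names ols and glm_gauss are hypothetical.

```stata
* Stand-in data (assumption): the auto dataset that ships with Stata
sysuse auto, clear

* Classical linear regression: errors assumed Normal(0, sigma^2)
regress price weight mpg
estimates store ols

* Same model stated as a GLM: Y given X is Normal with mean XB
* and constant variance
glm price weight mpg, family(gaussian) link(identity)
estimates store glm_gauss

* Put the two sets of estimates side by side
estimates table ols glm_gauss, b(%10.4f) se
```

The coefficient estimates from the two commands should agree, since both describe the same assumptions about the mean and variance of Y given X.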
>
>
> -- Phil
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


