
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: Re: st: Automatic fit of distribution


From   Richard Williams <[email protected]>
To   [email protected], [email protected]
Subject   Re: Re: st: Automatic fit of distribution
Date   Thu, 11 Jul 2013 13:04:56 -0500

Changing the subject slightly: it is often recommended that you examine your data first, e.g., draw graphs and run various diagnostics. I am inclined to agree; indeed, I always tell people to start with assorted descriptive statistics before launching into their high-tech models. However, procedures like stepwise regression are widely condemned. Again I am inclined to agree, but I have a hard time explaining exactly what the difference is.

In both cases, aren't you looking at the data first and using that information to guide your model building? Couldn't graphing the data first lead to over-fitting, and run the risk that an analysis of different data would lead to different results? If, say, my visual examination or diagnostics have led me to add squared terms or even use a different statistical method, aren't my p-values misleading? It seems that many of the cautions and concerns raised about stepwise could also be raised about approaches that are considered much more acceptable. My instincts go with the conventional wisdom, but I am not sure how I would respond if pressed on these matters.
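The worry about misleading p-values can be made concrete with a small simulation. The sketch below is only an illustration under assumptions that are not in the post: the outcome and all candidate predictors are pure noise, "looking at the data" is reduced to picking the predictor with the largest |t|, and the names (simulate, t_crit, and so on) are hypothetical. It compares how often a pre-specified predictor looks significant with how often the best-looking one does.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_stat(r, n):
    """t statistic for H0: correlation = 0 (same as the simple-regression slope test)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def simulate(n_sims=1000, n=50, k=10, t_crit=2.0106, seed=12345):
    """Return (rejection rate for a pre-specified predictor,
               rejection rate for the predictor chosen after looking).

    t_crit is approximately the two-sided 5% critical value of t with
    n - 2 = 48 degrees of freedom; change it if you change n.
    """
    rng = random.Random(seed)
    prespecified_hits = 0
    selected_hits = 0
    for _ in range(n_sims):
        y = [rng.gauss(0, 1) for _ in range(n)]  # outcome: pure noise
        abs_t = []
        for _ in range(k):
            x = [rng.gauss(0, 1) for _ in range(n)]  # predictor unrelated to y
            abs_t.append(abs(t_stat(pearson_r(x, y), n)))
        if abs_t[0] > t_crit:  # predictor fixed in advance
            prespecified_hits += 1
        if max(abs_t) > t_crit:  # predictor picked because it looked best
            selected_hits += 1
    return prespecified_hits / n_sims, selected_hits / n_sims


if __name__ == "__main__":
    pre, sel = simulate()
    print(f"pre-specified predictor 'significant': {pre:.3f}")
    print(f"best-of-{10} predictor 'significant':  {sel:.3f}")
```

Under these assumptions the pre-specified test should reject at roughly the nominal 5% rate, while the selected test rejects far more often (about 1 - 0.95^10 if the ten tests are treated as independent). The simulation does not settle whether informal graphing is as harmful as stepwise selection; it only shows the mechanism by which any data-driven choice can distort a nominal p-value.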

At 11:29 AM 7/11/2013, David Hoaglin wrote:
Diagnostics are fine, but there is no substitute for looking at the
data (e.g., in well-chosen histograms and quantile-quantile plots).
Programs that rely on the sample skewness and kurtosis will be blind
to mixtures that show more than one mode, and the sensitivity of
sample moments to outliers makes those measures unsuitable for
diagnosing distribution shape.
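A minimal sketch of both points, using made-up data that are not from the thread: a clearly bimodal but symmetric sample has the same sample skewness (zero) as a unimodal symmetric one, and a single outlier is enough to move the skewness of an otherwise symmetric sample a long way. The function name sample_skewness and the data values are assumptions chosen for illustration.

```python
def sample_skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (no small-sample correction)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Unimodal-looking symmetric sample: the integers -50..50.
symmetric = list(range(-50, 51))

# Two well-separated clusters, symmetric about zero (two modes).
bimodal = list(range(-12, -7)) + list(range(8, 13))

# The symmetric sample plus one wild value.
with_outlier = symmetric + [500]

if __name__ == "__main__":
    print("symmetric    :", sample_skewness(symmetric))
    print("bimodal      :", sample_skewness(bimodal))
    print("with outlier :", sample_skewness(with_outlier))
```

Skewness cannot tell the first two samples apart, although a histogram would separate them at a glance. The third sample differs from the first by one observation out of 102, yet its skewness is far from zero.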

Also, the process should take into account whether the data are
continuous or discrete.

David Hoaglin

On Thu, Jul 11, 2013 at 11:45 AM, Ariel Linden, DrPH
<[email protected]> wrote:
> I completely agree with Nick and Maarten that the user should do the work
> required to determine what type of distribution they are dealing with and go
> from there. However, it seems to me that there could be a program that
> "points the user in the right direction" after running a few simple
> diagnostics. For example, there are several programs already available to
> test for normality (i.e., -sktest-, -swilk-, -ksmirnov-). It would be rather
> straightforward to test for a Poisson distribution by checking whether the
> variance equals the mean. It would get harder as we go to other distributions,
> or fall between choices...
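One common way to turn the "variance = mean" idea into a test is the index-of-dispersion statistic, D = (n - 1) * s^2 / mean, which is approximately chi-squared with n - 1 degrees of freedom when the data are Poisson. The sketch below is an assumption about how such a check might look, not something proposed in the thread: it is written in Python rather than Stata, the function names are hypothetical, and the p-value uses the Wilson-Hilferty normal approximation to the chi-squared distribution so that only the standard library is needed.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def dispersion_test(counts):
    """Return (dispersion index s^2/mean, approximate two-sided p-value).

    D = (n - 1) * s^2 / mean is referred to chi-squared(n - 1) via the
    Wilson-Hilferty cube-root normal approximation.
    """
    n = len(counts)
    df = n - 1
    mean = fmean(counts)
    index = variance(counts) / mean  # sample variance over sample mean
    d = df * index
    z = ((d / df) ** (1 / 3) - (1 - 2 / (9 * df))) / math.sqrt(2 / (9 * df))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return index, p


def poisson_draw(rng, lam):
    """Knuth's multiplication method; adequate for small lam."""
    limit = math.exp(-lam)
    k, prod = 0, 1.0
    while True:
        k += 1
        prod *= rng.random()
        if prod <= limit:
            return k - 1


def make_samples(n=500, seed=2013):
    """Simulated Poisson(4) counts and gamma-mixed (overdispersed) counts."""
    rng = random.Random(seed)
    poisson = [poisson_draw(rng, 4.0) for _ in range(n)]
    # Gamma(shape=2, scale=2) rates give mean 4 but variance well above 4.
    overdispersed = [poisson_draw(rng, rng.gammavariate(2.0, 2.0)) for _ in range(n)]
    return poisson, overdispersed


if __name__ == "__main__":
    pois, over = make_samples()
    print("Poisson data:       index=%.2f  p=%.3f" % dispersion_test(pois))
    print("Overdispersed data: index=%.2f  p=%.3g" % dispersion_test(over))
```

A dispersion index near 1 is consistent with a Poisson model and an index well above 1 suggests overdispersion. As the rest of the thread cautions, a check like this can point in a direction but does not replace looking at the data.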
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam


