Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org, is already up and running.


Re: st: Output from SKTEST


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Output from SKTEST
Date   Mon, 2 Apr 2012 10:55:15 +0200

On Sat, Mar 31, 2012 at 11:03 PM, David Hoaglin wrote:
> Before you use tests based on sample skewness and kurtosis, it would
> be a good idea to look at a normal probability plot for each of the
> variables.  A histogram may also be useful, but it is not sufficient,
> because it will not show you enough about the tails.  You should look
> for outliers, as well as evidence of multimodality.

I agree. A nice compromise between the histogram and a normal
probability plot is the hanging rootogram. In Stata these can be drawn
with the -hangroot- package, which is available from SSC (-ssc install
hangroot-). Examples can be found on
<http://www.maartenbuis.nl/software/hangroot.htm>. In general, I would
not see these graphs as competitors but as complementary ways of
inspecting the data: each graph is good at highlighting different
aspects.
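
To make concrete what a hanging rootogram displays, here is a minimal
sketch in Python (standard library only). It is not the -hangroot-
implementation: the function name, the equal-width binning, and the
choice of a normal distribution fitted by sample mean and standard
deviation are all assumptions made for illustration. Each bar hangs
from the square root of the expected count and has length equal to the
square root of the observed count, so the bottom of the bar departs
from zero where the fit is poor.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def hanging_rootogram(data, n_bins=10):
    """Per-bin quantities for a hanging rootogram against a fitted normal.

    Illustrative helper (hypothetical name); assumes data are not constant.
    """
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins

    # Observed counts in equal-width bins; the maximum goes in the last bin.
    observed = [0] * n_bins
    for x in data:
        idx = min(int((x - lo) / width), n_bins - 1)
        observed[idx] += 1

    rows = []
    for i in range(n_bins):
        left = lo + i * width
        right = lo + (i + 1) * width
        # Expected count under the fitted normal distribution.
        expected = max(0.0, n * (fitted.cdf(right) - fitted.cdf(left)))
        sqrt_exp = math.sqrt(expected)
        sqrt_obs = math.sqrt(observed[i])
        rows.append({
            "left": left,
            "right": right,
            "observed": observed[i],
            "sqrt_expected": sqrt_exp,      # top of the hanging bar
            "sqrt_observed": sqrt_obs,      # length of the hanging bar
            "bar_bottom": sqrt_exp - sqrt_obs,  # near 0 when the fit is good
        })
    return rows


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
    for row in hanging_rootogram(sample):
        print(f"[{row['left']:6.2f}, {row['right']:6.2f})  "
              f"bar bottom = {row['bar_bottom']:6.2f}")
```

The square-root scale is what makes the tails readable: small expected
and observed counts are stretched relative to the large counts in the
middle of the distribution.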

Moreover, if these variables are part of a regression(-like) model,
then you are typically not interested in univariate
normality/Gaussianity. If your variable is the
dependent/explained/left-hand-side/y variable, then an assumption of
minor importance is that the variable is normally distributed
_conditional_ on the independent/explanatory/right-hand-side/x
variable(s). To check that assumption, you can inspect the residuals
after a linear regression or use -margdistfit- (also available
from SSC: -ssc install margdistfit-). Examples of the latter package
can be found here:
<http://www.maartenbuis.nl/software/margdistfit.html>. If the variable
is an independent/explanatory/right-hand-side/x variable, then its
distribution is even less important. Its only use is that it can
sometimes give a clue about the functional form of the relationship
between that x variable and the y variable.
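
The difference between marginal and conditional normality can be
illustrated with a small simulation, sketched here in Python (standard
library only). Everything in it is an assumption chosen for the
example: the data are simulated, the model has a single skewed x
variable, and the helper names are hypothetical. It is not what
-margdistfit- does; it only shows that y can be clearly non-normal on
its own while the residuals from the regression are close to
symmetric.

```python
import random
from statistics import mean


def ols_residuals(x, y):
    """Residuals from a simple linear regression of y on x (with intercept)."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def skewness(values):
    """Sample skewness: third central moment over variance to the power 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Simulated data: x is strongly right-skewed, the errors are normal.
rng = random.Random(2012)
x = [rng.expovariate(1.0) for _ in range(2000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
resid = ols_residuals(x, y)

print("skewness of y (marginal):   ", round(skewness(y), 2))
print("skewness of residuals:      ", round(skewness(resid), 2))
```

In this setup a univariate test on y would flag non-normality that
comes entirely from the distribution of x, which is why the residuals
are the more relevant thing to inspect.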

Hope this helps,
Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany


http://www.maartenbuis.nl
--------------------------