Statalist The Stata Listserver



Re: Re: st: lnskew0 and bcskew0


From   n j cox <[email protected]>
To   [email protected]
Subject   Re: Re: st: lnskew0 and bcskew0
Date   Tue, 24 Apr 2007 22:31:24 +0100

I am not sure that being affine is to the point;
linearity is the pertinent property, I think.
Otherwise I agree with Austin. Also, several
excellent points made by Maarten Buis earlier are
still, it seems, being overlooked here.

In general, log(x +/- constant) is not a transformation
I would ever use with percents. Rarely, if ever, does the constant
come with a story attached.

To restate various points more emphatically:

1. Having a skewness of 0 and being symmetric are not in
general identical, as any measure of skewness could be near 0
in some asymmetric distribution, surprised though one might
be by stark asymmetry when the chosen measure is 0.

(0, 0, 1, 1, 1, 1, 3) is plainly asymmetric, yet it satisfies
mean = median = 1 and thus (mean - median) / spread = 0.

2. Being symmetric and being normal (Gaussian) are also
not in general identical, as many symmetric distributions
are not Gaussian.
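
Both points can be checked in a few lines. What follows is only a
sketch: the sample (0, 0, 1, 1, 1, 1, 3) and the evenly spaced grid on
[0, 1] are illustrative choices, and the variable names x and u are
arbitrary.

```stata
* Point 1: mean = median, yet the sample is asymmetric
clear
input x
0
0
1
1
1
1
3
end
summarize x, detail
* (mean - median) is 0, but the moment-based skewness is positive
display "mean = " r(mean) "   median = " r(p50) "   skewness = " r(skewness)

* Point 2: symmetric but not Gaussian (evenly spaced grid on [0, 1])
clear
set obs 1001
generate double u = (_n - 1)/1000
summarize u, detail
* skewness is 0 up to rounding; kurtosis is close to 1.8, not the Gaussian 3
display "skewness = " r(skewness) "   kurtosis = " r(kurtosis)
```

So which measure of skewness is used matters, and even an exactly
symmetric variable can be far from normal.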

Broadly speaking, -lnskew0- is better thought of as a way
of fitting three-parameter lognormals, not a very general
way of symmetrising (let alone normalising) data. Unless
there are grounds for thinking that the data should be
lognormal, there are essentially no guarantees here.
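
As a hedged illustration of what -lnskew0- does, here is a sketch using
the auto data shipped with Stata. The new variable name lnmpgk is
arbitrary, and mpg is used only because it is convenient, not because
there are grounds for thinking it lognormal.

```stata
sysuse auto, clear

* create ln(+/-mpg - k), with k chosen so that the skewness is zero
lnskew0 lnmpgk = mpg

* the fitted constant k is left behind in r(gamma)
display "k = " r(gamma)

* zero skewness is not normality: look at the whole distribution
summarize lnmpgk, detail
qnorm lnmpgk
```

The quantile plot, not the skewness statistic alone, is what shows
whether a three-parameter lognormal is a plausible description.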

Nick
[email protected]

Austin Nichols

My suggestion was a bit tongue in cheek--it is not a z-score, as that
is an affine transformation (-ssc inst center- to use -center- to make
z-scores using the s option), whereas "sort mpg // g
z=invnorm(_n/(_N+1))" is nonlinear and makes a variable look very
normal indeed...
You should probably not transform your variable at all.
Why aren't you using -ice- and -mim- from SSC? -ssc inst ice- and
-ssc inst mim- put both at your disposal.
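
A sketch of the difference between the two, assuming the auto data
shipped with Stata. The names zscore and z_rank are illustrative only,
-egen, std()- stands in here for -center-, and -invnormal()- is the
current name of the older -invnorm()-.

```stata
sysuse auto, clear

* z-score: an affine rescaling, so the shape of the distribution
* (including its skewness) is unchanged
egen double zscore = std(mpg)

* rank-based normal scores: nonlinear, and dependent only on the
* ordering of mpg, not on its values
sort mpg
generate double z_rank = invnormal(_n/(_N + 1))

* compare skewness and kurtosis across the three variables
summarize mpg zscore z_rank, detail
```

Note that tied values of mpg get different normal scores here, in
whatever order -sort- happens to leave them.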

tdavis7
>
> Thank you for your response.  But I have one last question.  The latter
> transformation appears to be a simple z score?  Is that correct?  I ask
> because this changes my regression results a bit and I want to make
> sure that I haven't performed some obscure transformation that I am
> unable to explain.
>
> I am concerned about normality because I am creating multiple imputed
> data sets using Amelia with the data that I currently have.  One of the
> assumptions of multiple imputation is normality (univariate at the
> least but multivariate ideally).  I plan on using Stata to estimate
> OLS regression coefficients with the imputed data, but I also plan to do
> some SEM and HLM with the imputed data.  Can I still use the "g
> z=invnorm(_n/(_N+1))" transformation or should I stick with lnskew0
> even though the histogram appears skewed despite the actual skew
> statistic?
