


Re: st: van der Waerden transformation

From   Austin Nichols <>
Subject   Re: st: van der Waerden transformation
Date   Fri, 13 Apr 2012 12:18:11 -0400

A complete answer requires a complete exposition of item response
theory (IRT), but the quick answer is yes, more or less.
If you think underlying "achievement" is normally distributed, and you
used a reasonably well-designed test, you should convert the scores
back into a normal distribution as done via more sophisticated methods
on virtually every standardized test; the measure of latent
"achievement" is typically called theta.
Given that tests do not uniformly cover the difficulty space, there
will be skew and other nonnormality in scores, but a perfect test
(where the definition of perfect depends on what the test is to be
used for) might show a uniform distribution in percent correct from
zero to 100, which one could then turn back into a normal distribution
easily enough.  The distances then might give a reasonable measure of
how much harder it is to go from 98 to 99 than from 49 to 50 on this
hypothetical perfect test.
I have argued in print elsewhere that "achievement" is not normally
distributed, but let's leave that aside for now...  as no more
objectionable than assumptions in many -xt- commands on normality of
e.g. random effects/coefs.
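
To make the "distances" point concrete, here is a minimal sketch in Python (standard library only). It assumes the hypothetical perfect test described above, where percent correct is uniform on 0 to 100 and can be read as a percentile. The function name normal_score is made up for illustration, and nothing here is an IRT model.

```python
from statistics import NormalDist

def normal_score(percent_correct: float) -> float:
    """Map a percent-correct score, read as a percentile strictly
    between 0 and 100, to the matching standard normal quantile."""
    return NormalDist().inv_cdf(percent_correct / 100)

# Distance on the normal scale for a one-point gain in the middle
# of the distribution versus near the top.
gap_middle = normal_score(50) - normal_score(49)
gap_top = normal_score(99) - normal_score(98)

print(f"49 -> 50: {gap_middle:.4f}")
print(f"98 -> 99: {gap_top:.4f}")
print(f"ratio:    {gap_top / gap_middle:.1f}")
```

Under these assumptions the step from 98 to 99 comes out roughly ten times as large as the step from 49 to 50. Scores of exactly 0 or 100 have no finite normal quantile, so they would need separate handling.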

On Fri, Apr 13, 2012 at 3:33 AM, Maarten Buis <> wrote:
> On Thu, Apr 12, 2012 at 7:01 PM, Austin Nichols <> wrote:
>> Maarten--
>> how about test scores?
> Why would you want to make up distances between ranks in test scores?
> I can see why many of these do not have a natural unit, so some form
> of standardization is called for, but that does not mean that they
> should be forced into a normal/Gaussian distribution. If you find
> considerable skewness in your raw scores, would the forced-to-be-normal
> variable really be a better representation of what you found?
> -- Maarten
>> On Thu, Apr 12, 2012 at 12:42 PM, Maarten Buis <> wrote:
>>> On Thu, Apr 12, 2012 at 6:11 PM, Scott Merryman wrote:
>>>> Isn't the van der Waerden transformation just inverse_normal(rank/(N +1)) ?
>>> That sounds like an awful idea. That way you are just "inventing"
>>> distances between ranks that have nothing to do with what you
>>> observed. If you (generally speaking, not Scott specifically) really
>>> want to get rid of the skewness that badly, then just use the
>>> percentile rank and be honest about the fact that you have thrown away
>>> the information on the distances between the ranks rather than making
>>> those distances up. In general, I would _not_ try to get rid of the
>>> skewness, but rather use it. If it is a dependent variable that might
>>> suggest a -glm- with maybe a log link function. If it is an
>>> independent variable it might suggest a non-linear effect possibly to
>>> be modeled with splines (see: -mkspline-).
>>> I would be interested to hear if someone knows of an application where
>>> this transformation would make some sense. I cannot imagine one, but
>>> that may just be due to my lack of imagination.
>>> -- Maarten
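
For reference, the formula Scott quotes, the inverse normal of rank/(N + 1), can be sketched as follows. This is a Python illustration using only the standard library, not code from the thread. The function name van_der_waerden, the use of average ranks for ties, and the sample data are all assumptions. In Stata itself the inverse normal function is invnormal().

```python
from statistics import NormalDist

def van_der_waerden(values):
    """Return inverse-normal scores of rank / (N + 1).

    Ties receive the average of the ranks they span (an assumption;
    the thread does not say how ties should be handled).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        # Find the end of the block of tied values starting at pos.
        end = pos
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # 1-based average rank
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]

# Made-up, strongly right-skewed data.
scores = [1, 2, 3, 50, 400]
vdw = van_der_waerden(scores)
for raw, z in zip(scores, vdw):
    print(f"{raw:>4} -> {z: .3f}")
```

The transformed values depend only on the ranks, so the skewed input comes out symmetric around zero. That is exactly the behaviour Maarten objects to above: the spacing between the transformed values comes from the normal distribution, not from the observed data.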
