Statalist



RE: st: Create a normalized variable


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: Create a normalized variable
Date   Thu, 20 Nov 2008 17:56:53 -0000

Maarten's warning is well taken. Roughly speaking, my experience is that
natural scientists (physicists, biologists, etc.) are more likely to
take normalised as meaning scaled to [0, 1], far more commonly by (value
- min) / (max - min) than by percentile ranks. The more statistics you
know, the more likely you are to regard (value - mean) / sd as a natural
standardisation. 
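
As a rough illustration of the two formulas above, here is a minimal Python sketch (not Stata code; the function names and the sample values are made up for the example). It assumes the sample standard deviation (n - 1 divisor) for the z-scores.

```python
from statistics import mean, stdev

def min_max(values):
    """Scale to [0, 1] via (value - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal, so max - min is zero")
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Standardise via (value - mean) / sd, using the sample sd."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("standard deviation is zero")
    return [(v - m) / s for v in values]

values = [0, 2, 4, 8]          # hypothetical data
scaled = min_max(values)       # bounded by 0 and 1
standardised = z_scores(values)  # mean 0, sd 1, not bounded by 0 and 1
```

The first result always has minimum 0 and maximum 1; the second has mean 0 and standard deviation 1 and will generally include negative values.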

Nick 
[email protected] 

Maarten Buis

This very much depends on what you mean by normalized. 

Sometimes it means a transformation that will result in a variable that
is nearer to a normal (Gaussian) distribution. You cannot mean that, as
then the resulting variable cannot range between 0 and 1 (as a normal
distribution ranges between minus infinity and plus infinity). 

Sometimes, the term is used for standardization. A common method is to
subtract the mean and divide by the standard deviation (the results are
sometimes called z-scores). Again you cannot mean that as the resulting
variable will not range between 0 and 1. 

An alternative way to standardize would be to use percentile ranks,
which give for each respondent the proportion of respondents that are
smaller, poorer, dumber, etc. than that respondent. This will give you a
standardized variable which ranges between 0 and 1. The downside of this
approach is that it is less common, so you have more to explain, and
that it is a non-linear transformation: in particular, you only keep
information about the ordering of individuals and lose information
about the distances between them. Don't get me wrong though, I like
this form of standardization; it is just not suitable for every
application (which should not be a big surprise). The way to compute
these is discussed here:
http://www.stata.com/support/faqs/stat/pcrank.html
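
To make the loss of distance information concrete, here is a small Python sketch (not the method from the FAQ above, which may use a different convention). It assumes the simplest definition, the proportion of observations strictly smaller than each value; the function name and data are hypothetical.

```python
from bisect import bisect_left

def percentile_ranks(values):
    """Proportion of observations strictly smaller than each value."""
    ordered = sorted(values)
    n = len(ordered)
    # bisect_left returns how many sorted elements are strictly below v
    return [bisect_left(ordered, v) / n for v in values]

# Two hypothetical samples with very different spacing
evenly_spaced = percentile_ranks([10, 20, 30, 40])
one_outlier = percentile_ranks([1, 2, 3, 100])
```

Both samples receive the same ranks, because only the ordering is kept. Under this particular definition the largest value gets (n - 1) / n rather than exactly 1, and tied values share a rank.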

Finally, a linear transformation that will lead to a score between 0 and
1 is to simply divide that variable by 8. 




