Statalist The Stata Listserver

st: RE: Rescaling variables

From   "Nick Cox"
Subject   st: RE: Rescaling variables
Date   Tue, 8 Aug 2006 23:50:26 +0100

There is some confusion here. There is 
a difference between 

(a) converting to a scale on which mean is 0 and sd 1 

and 

(b) transforming to Gaussian (a.k.a. normal)

which is unaffected by the fact that people 
often want both. 

I can use -egen, std()- for (a) but 
this is just a linear rescaling and has no 
effect on the degree of non-Gaussianity in
the data. For example, skewness and 
kurtosis are invariant under linear rescalings. 
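
A quick way to check the invariance is sketched below. It is 
an illustration only: the auto dataset shipped with Stata and 
the variable name zprice are stand-ins, not part of the 
original question. 

```stata
* Sketch only: auto.dta is stand-in data, zprice is a made-up name
sysuse auto, clear

* (a) linear rescaling to mean 0 and sd 1
egen zprice = std(price)

* mean and sd change; skewness and kurtosis should agree
* for the two variables, up to rounding
tabstat price zprice, statistics(mean sd skewness kurtosis)
```

The same comparison can be read off -summarize price zprice, detail-. 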

Thus suppose I have a variable which is 
42 most of the time and 3.14159 the rest 
of the time. In this example, two spikes will 
remain two spikes under any one-to-one mapping 
and a bell-shape will remain out of reach, indeed
out of sight. 
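
The two-spike example can be tried directly. In the sketch 
below the 70/30 split, the sample size and the variable names 
are assumptions made for illustration, and -ln()- stands in 
for "any one-to-one mapping". 

```stata
* Sketch only: the 70/30 split and all names are hypothetical
clear
set obs 100
generate double x = cond(_n <= 70, 42, 3.14159)

* a linear rescaling and a nonlinear one-to-one transformation
egen double zx = std(x)
generate double lnx = ln(x)

* each variable still takes just two distinct values
tabulate zx
tabulate lnx
```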

Your own distribution does not sound so refractory, 
but be aware of the principle that transformation
often fails. 

> I had another question on rescaling variables. At this 
> point, I am trying to convert some of my independent 
> variables (running from 0 to 6) to standard normal 
> variables. I have been able to do it by subtracting the 
> mean from the numbers and dividing by the standard 
> deviation for each group (I have an unbalanced panel of 
> countries), but the problem comes for countries which had 
> no within variation for that variable, since dividing by 
> zero is giving missing observations in such cases. The 
> following command does not work:
>     by code, sort: egen newvar=std(oldvar)
> since egen does not work with by.
> Can anyone point me towards the correct command for direct 
> conversion of numbers into their standard normals thereby 
> circumventing the problem of dividing by zeros? Thanks as 
> usual!
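
For the by-group calculation described in the quoted question, 
one possible sketch follows. It assumes that the panel 
identifier is -code- and the variable is -oldvar-, as in the 
quoted command; gmean, gsd and z are made-up names. It relies 
on the -egen- functions -mean()- and -sd()-, which can be used 
with -by:-. It does not get around division by zero: groups 
with no within variation keep missing values, because a 
standardized value is not defined when the sd is 0. 

```stata
* Sketch only: code and oldvar come from the quoted command;
* gmean, gsd and z are hypothetical names
bysort code: egen double gmean = mean(oldvar)
bysort code: egen double gsd = sd(oldvar)

* missing wherever gsd is 0 (no within variation) or missing
* (a group with a single observation)
generate double z = (oldvar - gmean) / gsd

* which groups have an sd of exactly zero
tabulate code if gsd == 0
```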
