 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

st: RE: normalize variables

From: "Nick Cox"
Subject: st: RE: normalize variables
Date: Sun, 11 Apr 2010 17:57:23 +0100

The word "normalize" here evidently means scale to a [0,1] range.

Note first that using -egen- to do this is unnecessary unless you want
to do this panelwise.

su x1, meanonly
gen normal_x1 = (x1 - r(min)) / (r(max) - r(min))
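
Since the question mentions two variables (x1, x2), the same two lines can be wrapped in a loop. This is only a sketch: it assumes both variables are numeric and that neither is constant, since a constant variable gives a zero denominator and hence missing values.

```stata
* Sketch: scale each variable to [0,1] over the whole dataset.
* x1 and x2 are the variable names from the question.
foreach v of varlist x1 x2 {
    summarize `v', meanonly          // leaves r(min) and r(max) behind
    generate normal_`v' = (`v' - r(min)) / (r(max) - r(min))
}
```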

If you want to do this panelwise, it does become convenient to use
-egen- as you say.
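
A panelwise version might look like the following. The panel identifier name -id- is an assumption for illustration; substitute whatever variable identifies panels in your data. Any panel in which x1 is constant will get missing values, because the denominator is zero.

```stata
* Sketch: scale x1 to [0,1] separately within each panel.
* "id" is a hypothetical panel identifier, not a name from the question.
egen min_x1 = min(x1), by(id)
egen max_x1 = max(x1), by(id)
gen normal_x1 = (x1 - min_x1) / (max_x1 - min_x1)
```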

What I don't understand is how your main question can be answered
without knowing why you want to do this and why you think that you
"must" normalize. The best answer I can offer is that your indexes will
vary depending on whether they are calculated w.r.t. the entire dataset or
individual panels, and the choice between them is a scientific or
substantive one.
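
To see how much the two versions can differ, here is a toy example with made-up data (two panels, three periods; the names id, t, norm_all and norm_panel are invented for illustration).

```stata
* Toy data: panel 1 ranges over 2-6, panel 2 over 10-30.
clear
input id t x1
1 1 2
1 2 4
1 3 6
2 1 10
2 2 20
2 3 30
end

* Scaling over the entire dataset
su x1, meanonly
gen norm_all = (x1 - r(min)) / (r(max) - r(min))

* Scaling within each panel
egen min_p = min(x1), by(id)
egen max_p = max(x1), by(id)
gen norm_panel = (x1 - min_p) / (max_p - min_p)

list, sepby(id)
```

In norm_panel every panel's own minimum maps to 0 and its own maximum to 1, whereas in norm_all only the overall minimum (2) and the overall maximum (30) do. Which of these is appropriate depends on what the index is meant to measure.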

Nick
n.j.cox@durham.ac.uk

Evangelos.Constantinou@warwick.ac.uk

I am using panel data analysis and I want to generate an index but first I
must normalise the variables (x1,x2) contained in the index. I normalised
them by the following set of commands:

egen min_x1=min(x1)
egen max_x1=max(x1)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)

So, my question is whether I need to transform the commands to include the
"by(.)" option i.e.

egen min_x1=min(x1), by(.)
egen max_x1=max(x1), by(.)
gen normal_x1=(x1-min_x1)/(max_x1-min_x1)

and if so, should I include the panel or time variable?
