Jason's idea is easier to implement than he implies.
You can go
egen std_mpg = std(mpg)
and then use any rounding you like, e.g. any one of
gen class_mpg = round(std_mpg)
gen class_mpg = floor(std_mpg)
gen class_mpg = ceil(std_mpg)
(only one, as the same variable name cannot be generated twice).
Recoding to integers from 1 upwards seems superfluous,
as the integer codes already have a meaning: they
count standard deviations from the mean.
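
A minimal sketch of the whole sequence, assuming the auto data that
ship with Stata (-sysuse auto-) and picking floor() purely for
illustration; round() or ceil() would slot in the same way:

```stata
sysuse auto, clear
* standardize: (mpg - mean) / sd
egen std_mpg = std(mpg)
* class k holds values k or more, but less than k + 1,
* standard deviations from the mean
gen class_mpg = floor(std_mpg)
* inspect how many observations fall in each class
tabulate class_mpg
```

For a roughly unimodal variable the table should show the
outer classes holding fewer observations than the middle ones.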
In the case of income, working with log(income)
or quantiles would seem more likely to be
helpful.
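
A rough sketch of both options follows. The data are simulated
and the names income, log_income and income5 are made up for the
example; with real data only the -gen- and -xtile- lines would
be needed:

```stata
clear
set seed 12345
set obs 1000
* simulated, strictly positive, right-skewed stand-in for income
gen income = exp(rnormal(10, 1))
* option 1: work on the log scale
gen log_income = ln(income)
* option 2: five (nearly) equal-sized groups, coded 1 to 5
xtile income5 = income, nquantiles(5)
tabulate income5
```

Note that ln() returns missing for zero or negative values, so
real income data may need a decision on how to handle those.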
Nick
n.j.cox@durham.ac.uk
Jason Yackee
> I would use standard deviations to create the categorical
> variable. Say you wanted six categories -- "1" is greater
> than negative 2 stdev from the mean, "2" is any
> observation between negative 1 and negative 2 stdev from the
> mean, 3 is between 0 and negative 1 stdevs, 4 is between 0
> and positive 1 stdevs, and so on.
>
> The only thing "by hand" is calculating the descriptive
> statistics, and recoding the variable six times.
Mentzakis, Emmanouil
> > I do not want the variable to "become more like a normal
> > distribution".
> > What I would like is for the categories to be such that the tails
> > contain fewer individuals, with the numbers increasing as we get
> > closer to the middle category.
Mentzakis, Emmanouil
> >> I have a continuous variable (i.e. income) and I would like to
> >> transform it into a categorical one (e.g. 5 categories/levels or
> >> more).
> >>
> >> I would like to ask if there is any way that I can ask Stata to
> >> create this variable, deciding the appropriate cut-off points
> >> automatically so that the categories follow approximately a
> >> normal distribution or are of equal size.
> >
> > For the latter have a look at -help egen- and look at the cut
> > function.
> > For the former: how would you expect a variable to become more
> > like a normal distribution by making it coarser?