The Stata listserver

Re: st: Partitioning a Categorical Variable Based on Frequencies

From   n j cox
Subject   Re: st: Partitioning a Categorical Variable Based on Frequencies
Date   Mon, 22 Aug 2005 12:36:39 +0100

Angela James

> I'm trying to partition a categorical variable into classes based on
> the observed frequency for each category. That is, I have about 800
> companies that I'd like to group into 1) "large," 2) "medium", and 3)
> "small" companies based on the observed number (frequency) of
> employees for each company. Can anyone help me locate the appropriate
> command to do this?

It sounds to me as if you are trying to create a categorical variable
from a counted one.

Suppose your cut-offs are >= 1000 employees for large, >= 100 for medium.

gen cat_size = cond(size < 100, 1, cond(size < 1000, 2, cond(size < ., 3, .)))

which goes all on one line.


or, step by step,

gen cat_size = 1 if size < 100
replace cat_size = 2 if size >= 100 & size < 1000
replace cat_size = 3 if size >= 1000 & size < .

label def cat_size 1 "small" 2 "medium" 3 "large"
label val cat_size cat_size
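
If instead the data hold one observation per employee, so that the number of employees is not yet a variable, it would first have to be counted. A minimal sketch, assuming a hypothetical identifier variable named company (that name is an assumption, not something from the original post):

```stata
* assumption: one observation per employee; -company- is a hypothetical
* identifier (string or numeric) for the employer
* _N within each -by- group is the number of observations for that company
bysort company: gen size = _N
```

The cut-offs can then be applied to size as before.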

> Also, I need to rank the largest 40 or so companies by any number of
> criteria -- % female, % with employees over 40 years of age, etc.
> I've tried using the rank function with egen, but it simply ranks the
> companies according to the value for each (which is derived from their
> alphabetical, sequential ordering after I encoded the variable).
> Again, what is the easiest way to incorporate the observed frequency
> of different types of employees for each company into these analyses?

I have no real idea of what your problem is here. These criteria are
all numeric and -egen, rank()- should work fine. I don't know what
you are encoding here, but whatever you are holding in a string variable
sounds irrelevant to ranking.
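
For what it is worth, here is a sketch of how such a ranking might go, assuming one observation per company and hypothetical variables company, size and pct_female (none of these names comes from the original post). The -field- option of -egen, rank()- gives rank 1 to the highest value.

```stata
* assumption: one observation per company; -company-, -size- and
* -pct_female- are hypothetical variable names
* sort largest first (missing sizes go last) and flag the first 40
gsort -size
gen byte top40 = _n <= 40

* rank the flagged companies on % female, highest value = rank 1
egen rank_female = rank(pct_female) if top40, field

list company size pct_female rank_female if top40
```

Ties at the 40th position are broken arbitrarily by the sort, so the cut-off may need checking by eye.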

You'll need to say more, stating exactly what you typed
(standard advice in the Statalist FAQ).
