Statalist



st: RE: Decode categorical variable based on frequency


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Decode categorical variable based on frequency
Date   Wed, 30 Jul 2008 19:51:50 +0100

Yes. 

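* freq = number of observations in each category of rs1010
bysort rs1010: gen freq = _N
* number the categories 1, 2, ... by ascending frequency (ties broken by rs1010)
egen group = group(freq rs1010)
* shift the codes so that the rarest category is 0
replace group = group - 1
* label each code with its original rs1010 string
labmask group, values(rs1010)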

-labmask- is user-written, so it must be installed first: -search labmask- for locations.
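
As a check, here is a minimal sketch that rebuilds toy data with the frequencies from the question and applies the same steps. The counter variable n and the value label name grouplbl are made up for illustration, and the loop with -label define- is only an assumed stand-in for -labmask- in case it is not installed.

```stata
clear
* toy data: one row per category, then expand to the tabulated counts
input str3 rs1010 n
"1/1" 353
"1/3" 163
"3/3" 29
end
expand n
drop n

* same steps as above
bysort rs1010: gen freq = _N
egen group = group(freq rs1010)
replace group = group - 1

* stand-in for labmask: label each code with its rs1010 string
levelsof group, local(codes)
foreach c of local codes {
    levelsof rs1010 if group == `c', local(lab) clean
    label define grouplbl `c' "`lab'", modify
}
label values group grouplbl

* show the codes next to the labels
tab group
tab group, nolabel
```

With these counts, 3/3 (29 observations) should come out as 0, 1/3 (163) as 1, and 1/1 (353) as 2.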

Nick
n.j.cox@durham.ac.uk 

Bhoom Suktitipat

Is there a possible way to automatically code a categorical variable
based on its frequency?

For example,

my variable (a string variable) contains:

. tab rs1010
 rs1010 |      Freq.     Percent        Cum.
------------+-----------------------------------
        1/1 |        353       64.77       64.77
        1/3 |        163       29.91       94.68
        3/3 |         29        5.32      100.00
------------+-----------------------------------
      Total |        545      100.00


If I use encode rs1010, gen(newrs), 1/1 gets the lowest code instead.
Basically, I want to encode 3/3 (lowest frequency category) as 0, 1/3
as 1, and 1/1 as 2.

