
How can I produce a tabulation of a string variable that is listed in logical rather than alphabetical order?

Title   Ordering the results of a tabulation of a string variable
Author William Gould, StataCorp
Date July 1997

Suppose you have a string variable, comp, containing the following codes:

        MC  =  Most competitive
        HC  =  Highly competitive
        VC  =  Very competitive
        C   =  Competitive
        LC  =  Less competitive
        NC  =  Non-competitive

If you tabulate the string variable, the results are presented in alphabetical order:

 . tabulate comp

        comp |      Freq.     Percent        Cum.
 ------------+-----------------------------------
           C |         41       20.50       20.50
          HC |         35       17.50       38.00
          LC |         27       13.50       51.50
          MC |         34       17.00       68.50
          NC |         34       17.00       85.50
          VC |         29       14.50      100.00
 ------------+-----------------------------------
       Total |        200      100.00

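Here is how you put the categories in logical order: define a value label that assigns the numeric codes in the order you want, and then encode the string variable using that label: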

 . label define order  1 MC  2 HC  3 VC  4 C  5 LC  6 NC

 . encode comp, gen(c2) label(order)

That is, encode will use a predefined value label if you name it in the label() option. Now when I tabulate the new variable c2, the results are ordered from high to low:

 . tabulate c2

          c2 |      Freq.     Percent        Cum.
 ------------+-----------------------------------
          MC |         34       17.00       17.00
          HC |         35       17.50       34.50
          VC |         29       14.50       49.00
           C |         41       20.50       69.50
          LC |         27       13.50       83.00
          NC |         34       17.00      100.00
 ------------+-----------------------------------
       Total |        200      100.00
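
To confirm which numeric code sits behind each category, you can list the value label and tabulate the new variable without its labels. The commands below are a minimal sketch using the names defined above. The last line is an optional safeguard: as far as I know, encode adds any string value missing from the label as a new code unless you specify noextend, in which case it stops with an error instead. The variable name c2b is hypothetical and used only for illustration.

 . label list order

 . tabulate c2, nolabel

 . encode comp, gen(c2b) label(order) noextend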

If I wanted them ordered from low to high, I would just define my numeric coding differently before encode:

 . label define order2  1 NC  2 LC  3 C  4 VC  5 HC  6 MC

 . encode comp, gen(c3) label(order2)

 . tabulate c3

          c3 |      Freq.     Percent        Cum.
 ------------+-----------------------------------
          NC |         34       17.00       17.00
          LC |         27       13.50       30.50
           C |         41       20.50       51.00
          VC |         29       14.50       65.50
          HC |         35       17.50       83.00
          MC |         34       17.00      100.00
 ------------+-----------------------------------
       Total |        200      100.00
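
Once the variable is numeric, the displayed text no longer affects the order, so you could also show the full descriptions instead of the abbreviations. The following is a sketch only: the label name orderlong is hypothetical, and it assumes c2 was created with the codes 1 through 6 defined in order above.

 . label define orderlong 1 "Most competitive" 2 "Highly competitive" 3 "Very competitive" 4 "Competitive" 5 "Less competitive" 6 "Non-competitive"

 . label values c2 orderlong

 . tabulate c2

The table rows should still run from MC down to NC, because tabulate orders a numeric variable by its underlying values rather than by the label text.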