
How can I produce a tabulation of a string variable that is listed in logical rather than alphabetical order?

Title   Ordering the results of a tabulation of a string variable
Author William Gould, StataCorp
Date July 1997

Suppose you have a string variable, comp, containing the following codes:

        MC  =  Most competitive
        HC  =  Highly competitive
        VC  =  Very competitive
        C   =  Competitive
        LC  =  Less competitive
        NC  =  Non-competitive

If you tabulate the string variable, the results are presented in alphabetical order:

 . tabulate comp

        comp |      Freq.     Percent        Cum.
 ------------+-----------------------------------
           C |         41       20.50       20.50
          HC |         35       17.50       38.00
          LC |         27       13.50       51.50
          MC |         34       17.00       68.50
          NC |         34       17.00       85.50
          VC |         29       14.50      100.00
 ------------+-----------------------------------
       Total |        200      100.00

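Here is how you put the categories in logical order: define a value label whose numeric codes follow the order you want, and then encode the string variable using that label: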

 . label define order  1 MC  2 HC  3 VC  4 C  5 LC  6 NC

 . encode comp, gen(c2) label(order)

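That is, encode will use a predefined value label if you name it in the label() option, assigning each string the numeric code that the label gives it. Because tabulate lists a numeric variable in order of its codes, when I tabulate the new variable c2, I get the results ordered from high to low: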

 . tabulate c2

          c2 |      Freq.     Percent        Cum.
 ------------+-----------------------------------
          MC |         34       17.00       17.00
          HC |         35       17.50       34.50
          VC |         29       14.50       49.00
           C |         41       20.50       69.50
          LC |         27       13.50       83.00
          NC |         34       17.00      100.00
 ------------+-----------------------------------
       Total |        200      100.00
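
To confirm which numeric code sits behind each label, you can list the value label and tabulate without labels. The following is a minimal sketch on a small made-up dataset (the seven observations are illustrative, not the data shown above). Note that clear discards the data and value labels currently in memory. The noextend option is added here as a precaution: by default, encode adds any string not found in the label to the label under a new code, whereas noextend makes encode stop with an error instead.

```stata
* Illustrative data only; clear removes data and value labels from memory
clear
input str2 comp
"MC"
"C"
"NC"
"HC"
"VC"
"LC"
"C"
end

* Codes follow the desired display order, most to least competitive
label define order 1 "MC" 2 "HC" 3 "VC" 4 "C" 5 "LC" 6 "NC"

* noextend: refuse to encode if comp holds a string missing from the label
encode comp, generate(c2) label(order) noextend

* Inspect the mapping and the underlying numeric codes
label list order
tabulate c2, nolabel
```

If comp contained a misspelled or unexpected code, this version would report an error rather than silently appending a seventh category after NC.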

If I wanted them ordered from low to high, I would just define my numeric coding differently before encode:

 . label define order2  1 NC  2 LC  3 C  4 VC  5 HC  6 MC

 . encode comp, gen(c3) label(order2)

 . tabulate c3

          c3 |      Freq.     Percent        Cum.
 ------------+-----------------------------------
          NC |         34       17.00       17.00
          LC |         27       13.50       30.50
           C |         41       20.50       51.00
          VC |         29       14.50       65.50
          HC |         35       17.50       83.00
          MC |         34       17.00      100.00
 ------------+-----------------------------------
       Total |        200      100.00
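
If the high-to-low variable already exists, the reversed ordering could also be derived from it arithmetically rather than with a second encode. This is an alternative to the approach above, not a replacement for it, and it assumes the codes run exactly from 1 to 6 with no gaps. The sketch below uses a small made-up dataset, and clear discards the data and value labels currently in memory.

```stata
* Illustrative data only; clear removes data and value labels from memory
clear
input str2 comp
"MC"
"HC"
"VC"
"C"
"LC"
"NC"
end

* High-to-low coding, as in the first example
label define order 1 "MC" 2 "HC" 3 "VC" 4 "C" 5 "LC" 6 "NC"
encode comp, generate(c2) label(order)

* Reverse the six codes: 1 <-> 6, 2 <-> 5, 3 <-> 4
generate byte c3 = 7 - c2

* Attach a label that matches the reversed codes
label define order2 1 "NC" 2 "LC" 3 "C" 4 "VC" 5 "HC" 6 "MC"
label values c3 order2

tabulate c3
```

The second encode shown earlier is the safer choice when the coding might have gaps or additional categories, because it matches on the strings themselves rather than on the arithmetic.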