
Title | Ordering the results of a tabulation of a string variable
Author | William Gould, StataCorp
Someone has a string variable containing

    MC = Most competitive
    HC = Highly competitive
    VC = Very competitive
     C = Competitive
    LC = Less competitive
    NC = Non-competitive

If you tabulate the string variable, the results are presented in alphabetical order:

    . tabulate comp

           comp |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              C |         41       20.50       20.50
             HC |         35       17.50       38.00
             LC |         27       13.50       51.50
             MC |         34       17.00       68.50
             NC |         34       17.00       85.50
             VC |         29       14.50      100.00
    ------------+-----------------------------------
          Total |        200      100.00

Here is how you order the categories:

    . label define order 1 MC 2 HC 3 VC 4 C 5 LC 6 NC
    . encode comp, gen(c2) label(order)

That is, encode will use a predefined value label if you name it in the label() option. Now when I tabulate the new variable c2, I get the results ordered from high to low:

    . tabulate c2

             c2 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             MC |         34       17.00       17.00
             HC |         35       17.50       34.50
             VC |         29       14.50       49.00
              C |         41       20.50       69.50
             LC |         27       13.50       83.00
             NC |         34       17.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00

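To confirm that encode applied the intended coding, it can help to look at the numeric codes behind the labels. The following is a minimal sketch on a small made-up dataset (the seven observations are hypothetical, not the 200-observation data above). It assumes nothing beyond standard Stata commands: label list shows the mapping, and tabulate with the nolabel option shows the same table by code.

```stata
* Toy data, for illustration only. Note that clear drops the data
* and value labels currently in memory.
clear
input str2 comp
MC
HC
VC
C
LC
NC
C
end

* Define the desired order first, then let encode use it
label define order 1 "MC" 2 "HC" 3 "VC" 4 "C" 5 "LC" 6 "NC"
encode comp, generate(c2) label(order)

* Show the code-to-text mapping stored in the value label
label list order

* Same tabulation, displaying the numeric codes instead of the labels
tabulate c2, nolabel

* "C" should have been stored as code 4, as defined above
assert c2 == 4 if comp == "C"
```

Because tabulate sorts by the underlying numeric value, the row order in the labeled table follows these codes rather than the alphabetical order of the text.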
If I wanted them ordered from low to high, I would just define my numeric coding differently before encode:

    . label define order2 1 NC 2 LC 3 C 4 VC 5 HC 6 MC
    . encode comp, gen(c3) label(order2)
    . tab c3

             c3 |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             NC |         34       17.00       17.00
             LC |         27       13.50       30.50
              C |         41       20.50       51.00
             VC |         29       14.50       65.50
             HC |         35       17.50       83.00
             MC |         34       17.00      100.00
    ------------+-----------------------------------
          Total |        200      100.00
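
One caveat worth sketching: by default, encode adds any string value it does not find in the named label to that label, so a misspelled or differently cased category (say, "mc" instead of "MC") would quietly become an extra category. The noextend option of encode makes it stop with an error instead. The example below is a hedged illustration on hypothetical toy data, not part of the original answer.

```stata
* Toy data with one deliberately miscased value ("mc").
* Note that clear drops the data and value labels currently in memory.
clear
input str2 comp
MC
mc
HC
end

label define order 1 "MC" 2 "HC" 3 "VC" 4 "C" 5 "LC" 6 "NC"

* With noextend, encode refuses to add "mc" to the label and exits
* with an error. capture noisily shows the message without stopping
* a do-file, and c2 is not created.
capture noisily encode comp, generate(c2) label(order) noextend

* A nonzero return code signals that some value was not in the label
display _rc
```

If the return code is nonzero, clean the string variable (for example, with upper() or trim()) and run encode again.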