Statalist



Re: st: Alphabetical sort by value label


From   "Sergiy Radyakin" <[email protected]>
To   [email protected]
Subject   Re: st: Alphabetical sort by value label
Date   Tue, 16 Dec 2008 19:36:32 -0500

Dear David,

Sorry, I didn't read your code. Is this what you are trying to do?

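program labels_ABC
  * sort the data by the alphabetical order of varname's value labels
  syntax varname
  local lab_name : value label `varlist'
  if "`lab_name'" == "" {
    display as error "`varlist' has no value label attached"
    exit 198
  }
  tempvar new_order
  tempfile order_dictionary
  preserve
    * one observation per label, held in the variables value and label
    uselabel `lab_name', clear
    sort label
    generate `new_order' = _n
    sort value
    rename value `varlist'
    keep `varlist' `new_order'
    save `"`order_dictionary'"'
  restore
  sort `varlist'
  merge `varlist' using `"`order_dictionary'"'
  * drop rows added for labels that never occur in the data
  drop if _merge == 2
  drop _merge
  sort `new_order'
end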


** Demonstration
sysuse auto
label define raz 1 "odin raz" 2 "dva raza" 3 "tri raza" ///
  4 "chetyre raza" 5 "pyat raz"
label values rep78 raz
list
labels_ABC rep78
list

** EOF

The program should be fairly self-explanatory, but let me know if
something does not work as expected.
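
On your point (3), here is a minimal sketch of doing the recode inside
Mata, so that no recode list is ever held in a macro. It is only a
sketch: labrank() and the names ind_abc and indlbl_abc are made up for
illustration, and it assumes the variable is numeric with a value label
attached and that the new variable and label do not yet exist. It still
loops, but over the labels rather than over a macro, and it copies the
variable with st_data(), which may matter on a very large dataset.

```stata
sysuse nlsw88, clear

mata:
// labrank() is a hypothetical helper, not part of -labvalsort-.
// It stores in newvar the alphabetical rank of each observation's label
// and defines newlab so that rank i carries the i-th label alphabetically.
void labrank(string scalar varname, string scalar val_lab, string scalar newvar, string scalar newlab)
{
    real colvector   values, p, x, y
    string colvector text
    real scalar      i

    st_vlload(val_lab, values, text)   // labelled values and their text
    p = order(text, 1)                 // p[i] = row of the i-th label alphabetically
    x = st_data(., varname)            // copy of the original variable
    y = J(rows(x), 1, 0)
    for (i = 1; i <= rows(p); i++) {
        // each observation matches at most one labelled value
        y = y + i * (x :== values[p[i]])
    }
    y = editvalue(y, 0, .)             // missing or unlabelled values stay missing
    (void) st_addvar("long", newvar)
    st_store(., newvar, y)
    st_vlmodify(newlab, 1::rows(p), text[p])
}
end

mata: labrank("industry", "indlbl", "ind_abc", "indlbl_abc")
label values ind_abc indlbl_abc
tabulate ind_abc
```

After this, -sort ind_abc- puts the data in alphabetical label order
and -tabulate- lists the categories alphabetically, with the original
variable left untouched.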

Best regards, Sergiy Radyakin

On Tue, Dec 16, 2008 at 6:24 PM, David Elliott <[email protected]> wrote:
> I have a need to sort data by the alphabetical sort order of the value
> labels of a variable or have value labels appear in alphabetical order
> in tabs & tables.  Normally, I could -decode x,gen(y)- and -sort y- to
> do so, but in this case I have a very large dataset, long value labels
> and no room to add a string## variable.  Besides, I don't want to do
> this manually, I want to generalize the process to something like
> -labvalsort x-
>
> I've looked in various nooks and crannies of the manual, I've
> -hsearch-ed and -findit-ed without any luck.  Even Nick's redoubtable
> labutils couldn't help in this case.
>
> (1) Does anyone know if such a utility program exists?
>
> (2) If such a program does not exist, I will outline how I am
> approaching the problem on my own.
>
> In order to get a list of labels into manipulatable form, there were
> two basic approaches that I considered:
> (A) using a -levelsof- on the value labeled variable of interest and
> then looping to get the labels and sorting the resultant list
> (B) using mata's -st_vlload()- function to allow processing of the
> labels in a matrix
>
> As a mata neophyte (B) looked daunting but had a certain beauty to it
> since a simple function can accomplish the work of a loop.
>
> I created the following -labvalsort- and much to my surprise, it worked:
>
> -----------begin code - watch for wrapping--------------
> program define labvalsort
> version 9.0
>
> *! version 1.0.0  2008.12.16
> *! Alphabetical sort by value label
> *! by David C. Elliott
>
> *! syntax is labvalsort varname
> *! creates new variable prepending "_" to varname
> *! creates new value label prepending "_" to varname's value label
>
> local vallab : val lab `1'
> if "`vallab'"=="" {
>        error 182
>        exit
>        }
> mata: vallabsort("`1'","`vallab'")
> qui recode `1' `rc_list',gen(_`1')
> lab val _`1' `new_lab'
> end
>
> mata:
> void vallabsort(string scalar varname , string scalar val_lab)
> {
>        string scalar new_lab, rc_list
>        string matrix lab_list, sort_lab
>        new_lab = "_" + val_lab
>        st_vlload(val_lab, values=.,text=.)
>        lab_list=(strofreal(values),text)
>        sort_lab=lab_list[.,1], sort(lab_list,2)
>        /*
>        origorder = strtoreal(sort_lab[.,1]')
>        neworder = strtoreal(sort_lab[.,2]')
>        */
> // create new value label
>        st_vlmodify(new_lab,strtoreal(sort_lab[.,1]),sort_lab[.,3])
> // loop to create recode list
>        for (i=1; i<=rows(sort_lab); i++) {
>                rc_list = rc_list + "(" + sort_lab[i,2] + "=" + sort_lab[i,1]+")"
>            }
>        st_local("rc_list",rc_list)
>        st_local("new_lab",new_lab)
>        }
> end
> ---------------------------end code----------------------------
>
> One can test this with:
>
> -----------begin code - watch for wrapping--------------
> * testing labvalsort
> sysuse nlsw88.dta
> lab list indlbl
> tab ind
> labvalsort industry
> lab list _indlbl
> tab _ind
> sort _ind
> ---------------------------end code----------------------------
>
> The -labvalsort- program creates a new variable and value label where
> the new variable is numerically sorted in the new value label's
> alphabetical order.  (Note that error checking is rudimentary at this
> point)
>
> (3) I'd like to get rid of the recode list loop. I believe it may be
> possible to do the recode from within mata and had created two vectors
> origorder and neworder (currently commented out) that I intend(ed) to
> use.  However, I don't currently see how that is possible and would
> appreciate some suggestions on how to perform the recode from within
> mata.  It may be that there isn't a -st_something- function available.
>
> While I don't anticipate coming up against macro length limits in the
> usual situation, recodes involving thousands of numbers could create
> an rc_list exceeding 65536. Staying in mata would stay away from that
> problem, I think.
>
> If there is a positive answer to (1) above, then (2) is a bit
> redundant, albeit an interesting challenge.  If someone would like to
> help with (3) it would be appreciated.  Indeed, if there is an
> alternate approach to the -labvalsort- problem, I'd enjoy the
> discussion.  If others would find  -labvalsort- of use, I'll spruce it
> up a bit with user choice of newvar name for the sorted variable.
>
> Many thanks,
>
> --
> David Elliott
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


