Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Oliver Jones <ojones@wiwi.uni-bielefeld.de> |
To | statalist@hsphsun2.harvard.edu |
Subject | st: create local macros named after the values of a categorial variable |
Date | Thu, 12 Aug 2010 15:23:03 +0200 |
Hi Statalist,

while working on a programming problem I came to need a set of local macros that

i) are named after the values of a categorical variable, e.g. city codes, and
ii) contain the integer values from 1 to N, where N is the number of distinct city codes.

I worked out two alternatives that do the job, but would very much appreciate any feedback telling me whether there is a third alternative that is faster (in terms of computing time).

Here is an example of what I do:

-------------begin example-----------------------------
clear
set obs 3
local varname = "city_code"
gen `varname' = _n
expand = 4
list, abbreviate(9)

preserve
******
*
* Alternative 1 (sort and keep)
display "Start alternative 1"
sort `varname'
by `varname': gen byte index = _n
keep if index == 1
quietly count
forvalues i = 1/`r(N)' {
    local macro_name = `varname'[`i']
    local `macro_name' = `i'
    display "For City Code " `varname'[`i'] " the" ///
        " value of the local macro is ``macro_name''"
}
display "end"
* end
***
restore

******
*
* Alternative 2 (levelsof)
display "Start alternative 2"
levelsof `varname'
tokenize `r(levels)'
* Now we have one local macro for each of the
* city_codes, named 1, 2, ... , N.
* But we want to exchange the names
* of the macros by their values (the city_codes)
* and vice versa.

* get the number of different city_codes
local N_citys = 1
while "``N_citys''" != "" {
    local ++N_citys
}
local --N_citys
dis "{text}Number of distinct codes: {result}`N_citys'"

forvalues i = 1/`N_citys' {
    local new_mac_name = ``i''
    local `new_mac_name' = `i'
    local `i' //delete the macros we no longer need
    display "For City Code " `new_mac_name' " the" ///
        " value of the local macro is {result}``new_mac_name''"
}
display "end"
* end
***
---------------end example-----------------------------

The datasets for which this is needed are quite big: 30 million observations each year, and I have datasets for 24 years. The number of macros to be created is approx. 330, i.e. there are about 330 different city codes, but this number is not constant across years.

So far, alternative 1 (sort and keep) needs about 6 minutes to run and is outperformed by alternative 2 (levelsof), which needs about 1 minute.

Kind regards
Oliver
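
A possible variant of alternative 2, given here only as a sketch: levelsof can store the distinct values directly in a named local through its local() option, so a single foreach loop with a running counter is enough and neither the tokenize step nor the counting loop is needed. It also never reads the positional macros 1, 2, ..., N, which matters when a city code happens to coincide with one of those positions (as in the toy data, where the codes are 1, 2, 3). The macro names levels, code and idx are illustrative and not part of the original post, and whether this is actually faster on 30 million observations has not been timed here.

```stata
* Sketch only: assumes city_code holds integer codes
* and the dataset is already in memory.
quietly levelsof city_code, local(levels)

local idx = 0
foreach code of local levels {
    local ++idx
    * macro named after the city code, holding its rank 1..N
    local `code' = `idx'
    display "For City Code `code' the value of the local macro is ``code''"
}
display "Number of distinct codes: `idx'"
```

After the loop, idx holds the number of distinct city codes, so no separate count is required.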