
st: create local macros named after the values of a categorical variable


From   Oliver Jones <ojones@wiwi.uni-bielefeld.de>
To   statalist@hsphsun2.harvard.edu
Subject   st: create local macros named after the values of a categorical variable
Date   Thu, 12 Aug 2010 15:23:03 +0200

Hi Statalist,

While working on a programming problem I came to need
a set of local macros that are
 i) named after the values of a categorical variable,
    e.g. city codes
ii) contain the integer values from 1 to N, where N is
    the number of distinct city codes
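
To make the goal concrete, here is a minimal sketch of the
end result. The codes 101, 205 and 310 are made-up values
used only for illustration; they are not from the real data.

```stata
* Hypothetical city codes 101, 205 and 310 (illustration only).
* Each macro is named after a code and holds its position 1..N.
local 101 = 1
local 205 = 2
local 310 = 3

* Looking up a code returns its index
display "`205'"
```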

I worked out two alternatives that do the job, but would
very much appreciate any feedback on whether there is
a third alternative that is faster
(in terms of computing time).

Here is an example of what I do:
-------------begin example-----------------------------
clear
set obs 3
local varname = "city_code"
gen `varname' = _n
expand = 4
list, abbreviate(9)

preserve
******
*
* Alternative 1 (sort and keep)
display "Start alternative 1"
sort `varname'
by `varname': gen byte index = _n
keep if index == 1
quietly count
forvalues i = 1/`r(N)' {
	local macro_name = `varname'[`i']
	local `macro_name' = `i'
	display "For City Code " `varname'[`i'] " the" ///
		" value of the local macro is ``macro_name''"
}
display "end"
* end
***

restore
******
*
* Alternative 2 (levelsof)
display "Start alternative 2"
levelsof `varname'
tokenize `r(levels)'
* Now we have one local macro for each of the
* city_codes, named 1, 2, ... , N.
* But we want to swap the names of the macros
* with their values (the city_codes),
* so that each macro is named after a city_code.

* get the number of different city_codes
local N_citys = 1
while "``N_citys''" != "" {
	local ++N_citys
}
local --N_citys
dis "{text}Number of distinct codes: {result}`N_citys'"

forvalues i = 1/`N_citys'{
	local new_mac_name = ``i''
	local `new_mac_name' = `i'
	local `i'	//delete the macros we no longer need
	display "For City Code " `new_mac_name' " the" ///
		" value of the local macro is {result}``new_mac_name''"
}
display "end"
* end
***
---------------end example-----------------------------

The datasets for which this is needed are quite big: 30 million observations
per year, and I have datasets for 24 years.
The number of macros to be created is approx. 330, i.e. 330 different
city codes, but this number varies from year to year.

So far alternative 1 (sort and keep) needs about 6 minutes to run and is
outperformed by alternative 2 (levelsof), which needs about 1 minute.
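
For reference, alternative 2 can probably be written more compactly.
The following is only a sketch, not timed on the full data. It assumes
the codes are integers, as in the example, so that they are valid macro
names. It uses the local() option of levelsof and the "word count"
extended macro function in place of tokenize and the while loop; the
macro names codes and code are chosen here for illustration.

```stata
* Sketch only: a shorter way to write alternative 2.
* Assumes integer city codes, as in the toy data of the example.
clear
set obs 3
local varname "city_code"
gen `varname' = _n
expand 4

* Store the sorted distinct values directly in the local macro codes
quietly levelsof `varname', local(codes)

* Number of distinct codes
local N_citys : word count `codes'

* Create one macro per code, holding its position 1..N
local i = 0
foreach code of local codes {
	local ++i
	local `code' = `i'
}
display "Number of distinct codes: `N_citys'"
```

Whether this runs any faster than alternative 2 on 30 million
observations would have to be checked; most of the time is presumably
spent inside levelsof in both versions.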

Kind regards
Oliver

