Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: st: RE: RE: using information from value label to generate new variables
Date: Thu, 7 Jun 2012 19:29:09 +0100

Daniel Klein, the author of -labellist- (SSC), may want to comment, but he did exactly the right thing in defining -labellist-'s saved results.

Evelyn's problem arises from using an equals sign in

    quietly: labellist locations
    local loc_levels= r(locations_values)

Evelyn should have gone

    quietly: labellist locations
    local loc_levels `r(locations_values)'

The equals sign evaluates the expression to its right, and that truncates it. Here copying alone is sufficient. For more on this biting beast, see

    SJ-8-4  pr0045  . . . . . . . . Stata tip 70: Beware the evaluating equal sign
            . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
            Q4/08   SJ 8(4):586--587                                 (no commands)
            tip explaining the pitfall of losing content in a macro
            because of limits on the length of string expressions

Nick
n.j.cox@durham.ac.uk

Nick Cox

You can get the joint set of levels this way:

    local all
    qui forval i = 1/9 {
        levelsof a`i', local(levels)
        local all : list all | levels
    }

-labellist- is from SSC.

Evelyn Ersanilli

I have a cross-sectional survey dataset. For question "a", people were asked to list up to 9 countries. All variables a1-a9 (numeric, up to 3 digits) have the same value label, "locations". Because it is also attached to other variables, the value label locations holds not only the 3-digit country codes but also 5-digit regional codes.

For each country (e.g. France, Germany, Zimbabwe, etc.) that was mentioned, I would like to generate a variable that is 0 if a respondent has not named that country in any of the 9 replies (many people gave fewer than 9 replies), 1 if the respondent has named it in any of the up to 9 replies, and missing if the respondent didn't answer a1-a9. These variables should have the names of the countries.

Building on online examples I've gotten close to what I want, but I have a problem correctly and efficiently delimiting the list of newly generated variables. I first tried to get the values and labels from the first answer (a1).
However this risks omitting countries that have only been named in a2, a3, etc. In my second attempt I therefore tried to abstract the values and labels from the value label 'locations' using labellist and r().

The problem with Attempt 2 is that r() only saves up to (244?) characters, which is fewer than all the values together, and I haven't found out how to increase the storage capacity. Ideally I would also limit the abstraction of labels and values to only the 1-3 digit country codes, leaving out the 5-digit regional codes.

Any alternative suggestions would be welcome.

Here is my syntax:

    *-------------------Attempt I-----------
    //Step 1: abstract labels
    levelsof a1, local(a1_levels)
    foreach val of local a1_levels {
        local c`val' : label locations `val'
    }
    macro list

    //Step 2: generate dummies
    foreach X of local a1_levels {
        egen var`X'=anymatch(a1 a2 a3 a4 a5 a6 a7 a8 a9), values(`X')
    }

    //Step 3: label and rename
    local variablelist "var"
    foreach variable of local variablelist{
        foreach value of local a1_levels{
            label variable `variable'`value' "`c`value''"
            local stringy =strtoname("`c`value''") //needed because some country names contain spaces or other illegitimate characters
            rename `variable'`value' `stringy'
        }
    }
    *-----------------------------------------

    *-------------------Attempt II-----------
    //Step 1: abstract labels
    quietly: labellist locations
    local loc_levels= r(locations_values)
    foreach val of local loc_levels { /* loop over all values in local list `var'_levels */
        local c`val' : label locations `val' /* create macro that contains label for each value */
    }
    macro list
    //etc
    *-----------------------------------------

For step 2 I've also tried:

    *-----------------------------------------
    foreach X of numlist 2/935 {
        egen var`X'=anymatch(a1 a2 a3 a4 a5 a6 a7 a8 a9), values(`X')
    }
    *-----------------------------------------

But that generates way too many variables, as many of the values between 2 and 935 do not have a country code associated with them.
I could of course just look up all the values that were assigned a label in locations, but where's the fun in that.
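
For readers who want to see the pieces of this thread put together, here is a minimal sketch, not anyone's posted solution. It combines the joint-levels loop suggested above with -egen, anymatch()- and -strtoname()- from the original attempts. The toy data, the three-variable list in `avars', the helper variable n_answers and the "codes below 1000 are countries" cutoff are all assumptions made for illustration; with the real data `avars' would hold a1 to a9. Note that the equals sign is harmless in the -strtoname()- line only because a variable name is short; it is still the wrong tool for copying a long list.

```stata
* Toy data standing in for the survey (hypothetical values and labels)
clear
input a1 a2 a3
250 276 .
276 .   .
.   .   .
716 250 276
end
label define locations 250 "France" 276 "Germany" 716 "Zimbabwe" 25011 "Ile-de-France"
label values a1 a2 a3 locations

* Assumption: the answer variables; use a1 ... a9 with the real data
local avars a1 a2 a3

* Joint set of levels observed in any answer variable
local all
quietly foreach v of local avars {
    levelsof `v', local(levels)
    local all : list all | levels
}

* Count of non-missing answers, used to set non-respondents to missing
egen n_answers = rownonmiss(`avars')

foreach val of local all {
    * Assumption: country codes have at most 3 digits, regions have 5
    if `val' < 1000 {
        local lbl : label locations `val'
        * Short string, so evaluating with = cannot truncate it
        local newname = strtoname("`lbl'")
        egen `newname' = anymatch(`avars'), values(`val')
        replace `newname' = . if n_answers == 0
        label variable `newname' "`lbl'"
    }
}
drop n_answers
list
```

The sketch does not guard against two labels mapping to the same variable name, against a label that clashes with an existing variable, or against labels containing double quotes; any of those would need extra handling.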