



st: RE: RE: using information from value label to generate new variables


From   Nick Cox <n.j.cox@durham.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: using information from value label to generate new variables
Date   Thu, 7 Jun 2012 19:29:09 +0100

Daniel Klein, the author of -labellist- (SSC), may want to comment, but he did exactly the right thing in defining -labellist-'s saved results. 

Evelyn's problem arises from using an equals sign in 

quietly: labellist locations
local loc_levels= r(locations_values) 

Evelyn should have gone 

quietly: labellist locations
local loc_levels `r(locations_values)' 

The equals sign evaluates the expression to its right, and evaluation truncates the result at the limit on the length of string expressions. Here copying alone is sufficient. For more on this biting beast, see 

SJ-8-4  pr0045  . . . . . . . . Stata tip 70: Beware the evaluating equal sign
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/08   SJ 8(4):586--587                                 (no commands)
        tip explaining the pitfall of losing content in a macro
        because of limits on the length of string expressions
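
The difference between copying and evaluating can be checked with a small sketch like the one below. The macro names (long, copied, evaluated) are made up for illustration. In Stata 12 and earlier, where string expressions are limited to 244 characters, the evaluated copy should come out shorter than the original; in later versions with longer string expressions the two lengths may well agree.

```stata
* Build a macro longer than 244 characters: "1 2 3 ... 100"
local long
forvalues i = 1/100 {
    local long `long' `i'
}

* Copying: the contents are passed on as they are
local copied `long'

* Evaluating: the right-hand side is treated as a string expression
local evaluated = "`long'"

* Compare the lengths of the two results
local n_copied : length local copied
local n_evaluated : length local evaluated
display "copied: `n_copied' characters; evaluated: `n_evaluated' characters"
```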

Nick 
n.j.cox@durham.ac.uk 

Nick Cox

You can get the joint set of levels this way: 

local all
qui forval i = 1/9 { 
	levelsof a`i', local(levels) 
	local all : list all | levels 
}

-labellist- is from SSC. 
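
As a rough sketch only, the joint set of levels could then drive the creation of the indicator variables. The toy data, the label "Some region", and the helper variable n_named are invented so that the example runs on its own; with the real data the loops would run over a1-a9. Note that -strtoname()- may produce clashing or truncated names for long or similar labels, which this sketch does not guard against.

```stata
* Toy data standing in for a1-a9 (only three answers here)
clear
input a1 a2 a3
250 276   .
716   .   .
  .   .   .
276 250 716
end
label define locations 250 "France" 276 "Germany" 716 "Zimbabwe" 25001 "Some region"
label values a1 a2 a3 locations

* Joint set of levels across all answer variables
local all
quietly forvalues i = 1/3 {
    levelsof a`i', local(levels)
    local all : list all | levels
}

* Number of answers given, used to set non-respondents to missing
egen byte n_named = rownonmiss(a1 a2 a3)

* One 0/1 indicator per country named, named and labelled after the country
foreach v of local all {
    local lbl : label locations `v'
    local name = strtoname("`lbl'")
    egen byte `name' = anymatch(a1 a2 a3), values(`v')
    replace `name' = . if n_named == 0
    label variable `name' "`lbl'"
}

list
```

Here the equals sign in the -strtoname()- line is needed, because a function has to be evaluated, and the result is short enough for truncation not to matter.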

Evelyn Ersanilli

I have a cross-sectional survey dataset. 
For question "a", people were asked to list up to 9 countries.
All variables a1-a9 (numeric, up to 3 digits) have the same value label: "locations".
Because it is also attached to other variables, the value label locations holds not only the 3-digit country codes, but also 5-digit regional codes.

For each country (e.g. France, Germany, Zimbabwe, etc.) that was mentioned I would like to generate a variable that is 0 if a respondent did not name that country in any of the up to 9 replies (many people gave fewer than 9 replies), 1 if the respondent named it in any of them, and missing if the respondent didn't answer a1-a9.
These variables should have the names of the country.

Building on online examples I've gotten close to what I want, but I have a problem correctly & efficiently delimiting the list of newly generated variables.
I first tried to get the values and labels from the first answer (a1). However, this risks omitting countries that have only been named in a2, a3, etc.
In my second attempt I therefore tried to abstract the values and labels from the value label 'locations' using labellist and r().
The problem with Attempt 2 is that r() only saves up to (244?) characters, which is fewer than all values together, and I haven't found out how to increase the storage capacity.
Ideally I would also limit the abstraction of labels & values to only the 1-3 digit country codes, leaving out the 5-digit regional codes.
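
One possible way to keep only the country codes, sketched under the assumption that every country code is below 1000 and every regional code has 5 digits, is to filter the list of values before looping over it. The list in all below is a made-up stand-in for whatever list of values is actually in hand (for example the values copied from -labellist-'s saved results).

```stata
* Hypothetical list of values: country codes plus 5-digit regional codes
local all 4 250 276 716 25001 27602

* Keep only values with at most 3 digits
local countries
foreach v of local all {
    if `v' < 1000 {
        local countries `countries' `v'
    }
}

display "`countries'"
* shows: 4 250 276 716
```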

Any alternative suggestions would be welcome.


Here is my syntax:


*-------------------Attempt I-----------
//Step 1: abstract labels
levelsof a1, local(a1_levels)
foreach val of local a1_levels {
    local c`val' : label locations `val'
}
macro list

//Step 2: generate dummies
foreach X of local a1_levels {  
egen var`X'=anymatch(a1 a2 a3 a4 a5 a6 a7 a8 a9), values(`X')
}  

//Step 3: label and rename
local variablelist "var"
foreach variable of local variablelist{     
	foreach value of local a1_levels{     
	label variable `variable'`value' "`c`value''"
	local stringy =strtoname("`c`value''")		//needed because some country names contain spaces or other illegitimate characters
	rename `variable'`value' `stringy'
	}
}
*-----------------------------------------

*-------------------Attempt II-----------
//Step 1: abstract labels
quietly: labellist locations
local loc_levels= r(locations_values)
foreach val of local loc_levels {   /* loop over all values in local list loc_levels */
    local c`val' : label locations `val'  /* create macro that contains label for each value */
}
macro list
//etc
*-----------------------------------------





For step 2 I've also tried:
*-----------------------------------------
foreach X of numlist 2/935 {
	egen var`X'=anymatch(a1 a2 a3 a4 a5 a6 a7 a8 a9), values(`X')
}
*-----------------------------------------
But that generates way too many variables, as many of the values between 2 and 935 do not have a country code associated with them.
I could of course just look up all the values that were assigned a label in locations, but where's the fun in that...
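
A possible variation on this loop, given here only as a sketch, skips every value that has no label defined in locations. It relies on the strict option of the -label- extended macro function, which returns nothing when a value has no label. The toy data are invented so that the example runs on its own; the range 2/935 and the var prefix are taken from the loop above.

```stata
* Toy data standing in for a1-a9 (only two answers here)
clear
input a1 a2
250 276
716   .
end
label define locations 250 "France" 276 "Germany" 716 "Zimbabwe"
label values a1 a2 locations

* Create an indicator only for values that carry a label
forvalues X = 2/935 {
    local lbl : label locations `X', strict
    if `"`lbl'"' != "" {
        egen byte var`X' = anymatch(a1 a2), values(`X')
        label variable var`X' `"`lbl'"'
    }
}

describe var*
```

This still creates a variable for every labelled value in the range, whether or not any respondent named it, so it answers a slightly different question from looping over the levels actually observed.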

