Statalist



RE: st: Keep value labels after -mvdecode-


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Keep value labels after -mvdecode-
Date   Thu, 29 Oct 2009 14:13:01 -0000

I am going to imagine a program -formike- that is downstream of strings
like 

"maritalstatus 0 9" 
"ethnicity 0 99" 
"sasuser 42 99 99999" 

Where these strings come from I'll leave on one side. They are assumed
to start with a numeric variable name and continue with integer codes. 

I glanced at Mike's code, but once I thought I understood the problem, I
wrote my own, trying to do things at the lowest level and to be a bit
more general. 

*! 1.0.0 NJC 29 Oct 2009 
program formike 
	version 8.2 
	gettoken varname codes : 0 

	confirm numeric variable `varname' 
	numlist "`codes'", int 
	local codes `r(numlist)' 
	
	local lblname : value label `varname' 
	if "`lblname'" == "" { 
		di as err "`varname' does not have value labels attached" 
		exit 498
	} 

	tokenize "`c(alpha)'" 
	local i = 1 

	qui foreach x of local codes { 
		replace `varname' = .``i'' if `varname' == `x' 
		local label : label `lblname' `x' 
		if "`label'" == "" local label "`x'" 
		label def `lblname' .``i'' "`label'", add 
		local ++i
	}
end

This is just a stab, clearly, but note a few details: 

0. Rather than generalising this to take lots of variables at once,
you'd probably find it easier to call it repeatedly within a loop. 

1. You needn't guess at how many of .a ... .z you need, but clearly this
program will have a problem if you want more than all of them. 

2. As is explicit, the program will back off if no value labels are
attached already. 

3. If you really want -modify- rather than -add-, put it in. My instinct
is that you shouldn't want to be stomping on any existing labels for
data management of this kind. 
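
To make point 0 concrete, here is a rough sketch of calling -formike- in a
loop, once per string. It assumes the program above has already been defined
(say, saved as formike.ado). The toy data, the labels "married", "single",
"group 1" and "group 2", and the local names spec1 and spec2 are invented
purely for illustration; where the real strings come from is still left open.

```stata
* Sketch only: assumes -formike- is already defined as above.
* Toy data standing in for the real variables (hypothetical values).
clear
input maritalstatus ethnicity
0 1
1 0
9 99
2 2
end
label define maritalstatus 0 "refused" 1 "married" 2 "single" ///
	9 "not applicable because of age"
label define ethnicity 0 "question skipped for this respondent" ///
	1 "group 1" 2 "group 2" 99 "respondent undecided"
label values maritalstatus maritalstatus
label values ethnicity ethnicity

* One string per variable: a variable name followed by its missing codes
local spec1 "maritalstatus 0 9"
local spec2 "ethnicity 0 99"

* Call the program once per string rather than generalising the program
forval j = 1/2 {
	formike `spec`j''
}

tabulate maritalstatus, missing
tabulate ethnicity, missing
```

With several hundred variables the strings would more plausibly be read from a
file than typed as locals, but the loop itself would look much the same.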

Nick 
n.j.cox@durham.ac.uk 

Mike Lacy

My question is not identical to the original subject but close enough 
to preserve the same subject and thread:

Elan Cohen wrote:
 >My data consists of different types of missing values currently stored as
 >negative integers.  I'm using -mvdecode- to code them as missing, but I'd
 >like to transfer the value labels along with it.  Is this possible?


My situation is a more general case, I think.  I have a listing of 
which values denote some kind of missing data situation for each of 
several hundred variables. I want to recode them to missing (.a, .b, 
...) and retain the original value labels.  I cannot rely on the same 
numerical codes being used to denote the same missing value 
situations for different variables.  So, for example, I might have:

maritalstatus 0 9

ethnicity  0 99

where the value labels for maritalstatus are 0 = "refused" and 9 = 
"not applicable because of age", and the value labels for 
ethnicity are 0 = "question skipped for this respondent" and 99 = 
"respondent undecided".

The desired outcome would be that for maritalstatus, 0 is recoded to 
.a and 9 to .b, with the marital status label modified so that .a 
has the label "refused" and .b has "not applicable...".  Ethnicity 
would be handled in parallel, but not identically, given the 
different meaning of 0.
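
Done by hand for maritalstatus alone, the desired outcome might look like the
sketch below. The toy data, the label "married" for code 1, and the choice to
name the value label after the variable are assumptions made only for
illustration. It uses -mvdecode- with explicit code-to-missing mappings and
then adds the labels for .a and .b separately.

```stata
* Sketch only: toy data standing in for the real variable (hypothetical).
clear
input maritalstatus
0
1
9
end
label define maritalstatus 0 "refused" 1 "married" ///
	9 "not applicable because of age"
label values maritalstatus maritalstatus

* Recode 0 to .a and 9 to .b
mvdecode maritalstatus, mv(0=.a \ 9=.b)

* -mvdecode- does not carry the labels over, so add them by hand
label define maritalstatus .a "refused" ///
	.b "not applicable because of age", add

tabulate maritalstatus, missing
```

Repeating this by hand for several hundred variables, each with its own
codes, is what I want to avoid.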

The only solution I could come up with, which worked but seemed 
inconvenient, was to do the recode for each value, put the value 
label into a local, modify the label, reassign it, etc.

prog rectomiss  // handles the recode for one variable
    args varname v1 v2 v3 v4 v5   // clumsy, I know
    local misslist ".a .b .c .d .e" // five should be enough
    tokenize `misslist'
    local i = 1
    foreach val of numlist `v1' `v2' `v3' `v4' `v5' {
        local misscode ``i''
        recode `varname' (`val' = `misscode')
        local ++i
        local labelname: value label `varname'
        if "`labelname'" != "" { // only if a value label is attached
            local lblstrg: label (`varname') `val'
            label define `labelname' `misscode' "`lblstrg'", add
            label values `varname' `labelname'
        }
    }
end
// sample usage
// rectomiss ethnicity 0 99
// rectomiss maritalstatus 0 9

This seemed quite a bit of messing around for a simple task, although 
the execution was fast. Are there simpler ways to do this? I'd find 
that instructive.

(The application, by the way, might be of interest to some U.S. 
social scientists.  The Stata version of the General Social Survey 
files is distributed with a do-file to recode all the missing value 
codes to ".", which loses the original value labels and the distinctions 
between different reasons for missing values. If I had a decent, simpler 
way to do what I did above, I'd pass it on to the folks who 
distribute the data.)

