Statalist The Stata Listserver


Re: st: using information in value labels

From   David Kantor <>
Subject   Re: st: using information in value labels
Date   Mon, 12 Feb 2007 22:29:28 -0500

In reply to Alex's question about recoding variables to missing:

I usually treat each variable separately, based on a knowledge of its coding scheme. Thus, I usually don't run a loop unless there is some common coding scheme used by a multitude of variables. It looks like what you want is code to handle anything that might come along.

Also, you didn't mention anything about recoding values such as 99, 999, 9999, etc., and 98, 998, 9998, etc., to missing. You need to do this, unless you want to qualify every analysis command with something like "... if var1 < 98". (I once wrote a program to recode the values 99, 999, 9999, etc., and 98, 998, 9998, etc., to distinct extended missing values such as .a, .b, .c, so as not to lose information about the distinct meanings of, say, 98 vs. 99.)
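For illustration only (this is not the program mentioned above), here is a minimal sketch of that kind of recoding using Stata's built-in -mvdecode-. The variable names, the toy data, and the assumption that 98 and 998 share one meaning while 99 and 999 share another are all made up for the example.

```stata
* Toy data; variable names and sentinel codes are illustrative assumptions
clear
input age income
25 40000
98 998
99 52000
41 999
end

* Recode each sentinel value to its own extended missing value,
* so that the distinction between, say, 98 and 99 is preserved
mvdecode age,    mv(98=.a \ 99=.b)
mvdecode income, mv(998=.a \ 999=.b)

* Analysis commands now exclude these observations without an -if- qualifier
summarize age income
tabulate age, missing
```

Because the number of digits in the sentinel depends on each variable's range, the mapping generally has to be chosen per variable (or per group of variables that share a coding scheme).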

As for value labels that contain text such as "Missing", "missing", "did not respond": it may be possible to write code to look for these, but it would need a specific set of strings to look for (either built into the code or passed as a parameter). In either case, finding a comprehensive set of such strings may not be practical. And then, once you find such text strings, what do you recode the corresponding values to? You showed that you want them coded as . (system missing), but that can lose information that you may want to retain. It would be better to recode each distinct "Missing", "missing", "did not respond" category to a distinct extended missing value. How many such text values can you expect? Are there more than 26? (There are 26 extended missing values, .a through .z.)
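A rough sketch of such a search, under stated assumptions: the search strings are built into the code, each string is paired with one extended missing value, and matching is a case-insensitive substring test. The variable x1, the label name x1lbl, and the strings themselves are hypothetical.

```stata
* Toy data; variable, label name, and search strings are illustrative assumptions
clear
input x1
1
2
8
9
end
label define x1lbl 1 "Yes" 2 "No" 8 "Did not respond" 9 "Missing"
label values x1 x1lbl

* Search strings (lowercase), each paired by position with an extended missing value
local patterns `""missing" "did not respond" "refused""'
local codes    .a .b .c

foreach var of varlist x1 {
    local lbl : value label `var'
    if "`lbl'" == "" continue            // no value label attached; skip
    quietly levelsof `var', local(levels)
    foreach i of local levels {
        local txt : label `lbl' `i'      // label text for value `i'
        local k = 0
        foreach p of local patterns {
            local ++k
            if strpos(lower(`"`txt'"'), "`p'") {
                local mv : word `k' of `codes'
                quietly replace `var' = `mv' if `var' == `i'
                * carry the original label text over to the new missing code
                label define `lbl' `mv' `"`txt'"', modify
                continue, break
            }
        }
    }
}
tabulate x1, missing
```

A substring match like this can produce false positives (a label such as "No missing teeth" would match "missing"), which is one reason a fully general version may not be practical.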

And what are the actual numeric values corresponding to labels such as "Missing", "missing", "did not respond"? Did you know that value labels can be attached to extended missing values? So if the coding had been done cleverly in the first place (with the value labels cleverly constructed), there would be no need for any further recoding.
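To illustrate the point about labeled extended missing values, here is a small sketch with made-up data; the variable q1 and the label q1lbl are hypothetical names.

```stata
* Toy data in which nonresponse is already stored as extended missing values
clear
input byte q1
1
2
.a
.b
end

* Value labels may be attached to extended missing values as well as to integers
label define q1lbl 1 "Agree" 2 "Disagree" .a "Missing" .b "Did not respond"
label values q1 q1lbl

tabulate q1, missing     // shows the labeled missing categories
summarize q1             // uses only the nonmissing observations
```

With data coded this way, the reason for nonresponse stays visible in tabulations while estimation commands drop those observations automatically.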

To summarize, your desired program is plausible, but may not be practical.

I hope this helps; good luck.

At 09:23 PM 2/12/2007, Alex wrote:


I have a complex survey dataset which has been helpfully cleaned: each variable has its missing values coded with 9/99/999/9999 or 98/998/9998 etc., and things like "Missing", "Did not respond", "Refused to answer", etc. are all coded in as value labels.

I would like to treat all of these as missing and want to run a loop as follows:

local allvars x1 ... x100

foreach var of local allvars {
    label list `var'
    * pseudocode follows
    * scan label to see which levels include "Missing", "missing", "did not respond", etc.
    * for each such level `i':
    recode `var' (`i' = .)
}


I was wondering if this is something that could reasonably be implemented.
