



RE: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
Date   Mon, 17 May 2010 18:26:39 +0100

None of these statements is entirely correct. 

In the first clause, the maximum can be less than 9. 

In the second and third clauses, the test is whether the maximum falls in certain intervals. The range of the data is otherwise not considered. 

The last clause isn't unconditional as in your paraphrase; it applies only when r(max) exceeds 999. 

Nick 
n.j.cox@durham.ac.uk 
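
As a rough check of the points above, the following sketch (not part of the original exchange) rebuilds Martin's example data, quoted further down, and reports for each variable the maximum, the code that his if/else chain would pass to -mvdecode-, and how many observations hold that code. The local macros max, code and n are introduced here for illustration only.

```stata
* Sketch only: rebuild Martin's example data, then report which code
* his if/else chain would treat as missing for each variable.
clear
input byte(var1 var2) int(var3 var4)
1 1 1 1
2 2 2 2
3 3 3 3
4 8 99 999
5 9 100 1000
6 10 101 1001
7 11 150 5000
9 12 999 9999
end

foreach var of varlist * {
    quietly summarize `var', meanonly
    * save the maximum now, because -count- below overwrites r()
    local max = r(max)
    * same branch conditions as in Martin's loop
    if `max' <= 9 local code 9
    else if inrange(`max', 10, 99) local code 99
    else if inrange(`max', 100, 999) local code 999
    else local code 9999
    quietly count if `var' == `code'
    local n = r(N)
    display "`var': max = `max', code = `code', matches = `n'"
}
```

With these data the maxima are 9, 12, 999 and 9999, so the codes chosen are 9, 99, 999 and 9999 respectively. Only one code is tested per variable, which is why the 99 in var3 and the 999 in var4 (both in observation 4) are left alone.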

Michael McCulloch

Thanks Nick, this helped me understand the code. Am I correct then to
understand that:

. if r(max)<=9 mvdecode `var', mv(9)
	means: "change all values of 9 to missing when 9 is the max of the range"

. else if inrange(r(max),10,99) mvdecode `var', mv(99)
	means: "change all values of 99 to missing when the range is 10 to 99"

. else if inrange(r(max),100,999) mvdecode `var', mv(999)
	means: "change all values of 999 to missing when the range is 100 to 999"

. else mvdecode `var', mv(9999)
	means: "change all values of 9999 to missing"

On May 17, 2010, at 9:58 AM, Nick Cox wrote:

> 99 isn't changed because there are bigger values in the same  
> variable. Thus, it is assumed that it does not mean missing.

Michael McCulloch

> In Martin's code, I noticed that:
> 	for observation #8, var4 is changed to missing,
> 	for observation #4, var3 is not changed to missing.
> This puzzled me because they both have "999" as original value.
>
> It also looks like values "9", "999" and "9999" are changed to
> missing, but not "99".
> Michael
>
> On May 17, 2010, at 9:30 AM, Lachenbruch, Peter wrote:
>
>> Looks good to me.
>>
>> Tony
>>
>> Peter A. Lachenbruch
>> Department of Public Health
>> Oregon State University
>> Corvallis, OR 97330
>> Phone: 541-737-3832
>> FAX: 541-737-4001
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu]
>> On Behalf Of Martin Weiss
>> Sent: Monday, May 17, 2010 12:35 AM
>> To: statalist@hsphsun2.harvard.edu
>> Subject: AW: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
>>
>>
>>
>> What does the -mvdecode- solution look like then? Like this?
>>
>>
>>
>> *************
>> clear*
>>
>> inp byte(var1 var2) int(var3 var4)
>> 1 1 1 1
>> 2 2 2 2
>> 3 3 3 3
>> 4 8 99 999
>> 5 9 100 1000
>> 6 10 101 1001
>> 7 11 150 5000
>> 9 12 999 9999
>> end
>>
>> foreach var of varlist *{
>> 	sum `var', mean
>> 	if r(max)<=9 mvdecode `var', mv(9)
>> 	else if inrange(r(max),10,99) mvdecode `var', mv(99)
>> 	else if inrange(r(max),100,999) mvdecode `var', mv(999)
>> 	else mvdecode `var', mv(9999)
>> }
>>
>> li, noo
>> *************
>>
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Steve
>> Samuels
>> Sent: Monday, May 17, 2010 03:00
>> To: statalist@hsphsun2.harvard.edu
>> Subject: Re: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
>>
>> Mandy, if you know this much about each variable, I see no advantage
>> or necessity to your approach.  -mvdecode- appears to be superior in
>> every way.  It is not only more direct and clearer, but will also handle
>> all the other "non-data" codes. Clarity is very important: other
>> people (and you, perhaps, in the future) will be able to understand
>> your Stata statements without any lengthy explanation.  None of the
>> other solutions can claim that.
>>
>> Steve
>>
>>
>>
>> On Sun, May 16, 2010 at 8:33 PM, Amanda Fu <mandy.fu1@gmail.com>
>> wrote:
>>> Dear Mr. Weiss and Lachenbruch,
>>>
>>> I am sorry; I should have been clearer when describing my question.
>>> In my opinion, I need to be careful about this problem: for example,
>>> for a variable with a 10-point scale, the value 9 is a real scale
>>> value, and 99 in that case means "not answered".
>>>
>>> The pattern is like this:
>>> (1) if the maximum value of a variable is smaller than 9, then the
>>> "not answered" takes the value 9;
>>> (2) if the maximum value of a variable is smaller than 99 but
>>> greater than 10, then the "not answered" takes the value 99;
>>> (3) if the maximum value of a variable is smaller than 999 but
>>> greater than 100, then the "not answered" takes the value 999;
>>> and so on.
>>>
>>> (And you are absolutely right to remind me that there are values
>>> such as 7, 8, 98, or 97 to indicate "refused to answer" or "invalid
>>> answer". Here I would like to keep the focus on the one example of
>>> "not answered", because the other values could be dealt with in the
>>> same way.)
>>>
>>> Thanks for help from both of you!
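
The "and so on" in the pattern above could be handled without writing one branch per width. What follows is a minimal sketch, not taken from the thread, assuming non-negative integer-coded variables whose "not answered" code is the all-nines number with as many digits as the variable's maximum. The local macros max and code are illustrative names, and the data are Martin's example values.

```stata
* Sketch only: choose the smallest all-nines code (9, 99, 999, ...)
* that is at least as large as the variable's maximum, then decode it.
clear
input byte(var1 var2) int(var3 var4)
1 1 1 1
2 2 2 2
3 3 3 3
4 8 99 999
5 9 100 1000
6 10 101 1001
7 11 150 5000
9 12 999 9999
end

foreach var of varlist * {
    quietly summarize `var', meanonly
    local max = r(max)
    local code 9
    * grow 9 -> 99 -> 999 -> ... until the code reaches the maximum
    while `code' < `max' {
        local code = `code' * 10 + 9
    }
    mvdecode `var', mv(`code')
}

list, noobs
```

On these data the loop decodes 9 in var1, 999 in var3 and 9999 in var4, and changes nothing in var2, whose maximum is 12. As with the four-branch version, a smaller all-nines value such as the 99 in var3 is kept, because larger values occur in the same variable.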
