



RE: st: Replacing missing values only works one way?


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Replacing missing values only works one way?
Date   Wed, 28 Jul 2010 17:50:20 +0100

I agree with Eric and Joseph. 

I am responsible for the FAQ mentioned which "doesn't quite make sense" (and no doubt for much else that qualifies also). 

The key material here is in its sections 3 and 4. The main idea is that -replace- looking forwards is _not_ just a reverse of the same thing backwards because Stata -replace-s according to the current dataset order. 

Let me try again with a simple example along Joseph's lines but even more explicit. 

y is missing in obs 2, 3, 4: 

Obs    y
1      1
2      . 
3      . 
4      .  
5      5

Consider first 

. replace y = y[_n-1] if missing(y) 

Now -replace- is applied to each observation in turn. 

Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied. 

Obs 2. missing(y) is true. The previous value y[_n-1] namely y[1] namely 1 is used to -replace- the value in this obs. 

Obs 3. missing(y) is true. The previous value y[_n-1] namely y[2] namely 1 -- it's just been changed -- is used to -replace- the value in this obs. 

Obs 4. missing(y) is true. The previous value y[_n-1] namely y[3] namely 1 -- it's just been changed -- is used to -replace- the value in this obs.

Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.

What's happening here I call a cascade. You might want to substitute your own image, say that of a domino effect.

You now have

Obs    y
1      1
2      1 
3      1 
4      1  
5      5
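
For anyone who wants to reproduce that cascade, here is a minimal sketch. It assumes nothing beyond the five-observation example above, typed in with -input-; the commands are given do-file style, without the dot prompt.

```stata
* Rebuild the five-observation example: y is missing in obs 2, 3, 4
clear
input y
1
.
.
.
5
end

* Stata works through the observations in their current order, so each
* missing value picks up the value just written to the previous observation
replace y = y[_n-1] if missing(y)

list
```

The listing should match the table above, with 1 carried forward into obs 2, 3 and 4.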

Now go back to the original and consider 

. replace y = y[_n+1] if missing(y)

Note that despite the forward reference to [_n+1] Stata does not reverse the direction of processing. 

Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied. 

Obs 2. missing(y) is true. The following value y[_n+1] namely y[3] namely . is used to -replace- the value in this obs. That makes no difference. 

Obs 3. missing(y) is true. The following value y[_n+1] namely y[4] namely . is used to -replace- the value in this obs. That makes no difference.

Obs 4. missing(y) is true. The following value y[_n+1] namely y[5] namely 5 is used to -replace- the value in this obs.

Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.

You now have

Obs    y
1      1
2      . 
3      . 
4      5  
5      5

There is no cascade backwards. If you want a cascade of backwards replacements, you have to reverse time, or more prosaically change the -sort- order. There is more in the FAQ. 
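
As a sketch of what changing the -sort- order might look like for the example above: record the current observation number, reverse the order, cascade with [_n-1], and then restore the original order. The variable name obsno is made up for illustration; with real data you would more likely reverse on a time variable, and the FAQ remains the fuller treatment.

```stata
* Rebuild the five-observation example: y is missing in obs 2, 3, 4
clear
input y
1
.
.
.
5
end

* obsno is a hypothetical helper that remembers the original order
gen long obsno = _n

* Reverse the dataset, so that "previous" now means "originally following"
gsort -obsno

* The usual cascade, which now runs from the original obs 5 towards obs 1
replace y = y[_n-1] if missing(y)

* Put the observations back in their original order
sort obsno
list obsno y
```

In the original order, obs 2, 3 and 4 now all hold 5, which is the backwards cascade that the single -replace- with [_n+1] does not give.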

Nick 
n.j.cox@durham.ac.uk 

Joseph McDonnell

As Eric says, the commands do work. It's just that your expectation
doesn't match what you get. Eric has probably identified the "problem",
namely a missing followed by one or more missings. Suppose rows 1 and 2
are missing in myvar but row 3 isn't. The replace command tells Stata to
replace the value in row 1 with the value in row 2. But that's missing
as well, so row 1 doesn't change. Row 2 will be assigned row 3's value
and you'll need to do another replace to propagate the value backwards
to row 1.

You could use "assert" to check for missingness and loop until all
missings are filled:

. capture assert myvar<.
. while _rc {
. replace myvar=myvar[_n+1] if myvar>=.
. capture assert myvar<.
. }
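
One caveat with a loop of this kind: if the last observation is itself missing, there is nothing after it to copy, so the assert can never succeed and the loop does not end. A hedged variant, sketched below, stops as soon as a pass fills nothing. The toy data and the local macro names (before, changed) are assumptions for illustration only.

```stata
* Toy data: leading missings, plus a trailing missing that can never be filled
clear
input myvar
.
.
3
.
end

* Count the missings, then repeat the replace until a pass stops reducing them
quietly count if missing(myvar)
local before = r(N)
local changed = 1
while `changed' {
    replace myvar = myvar[_n+1] if missing(myvar)
    quietly count if missing(myvar)
    * changed is 1 only if this pass filled at least one missing value
    local changed = (r(N) < `before')
    local before = r(N)
}

list
```

Here obs 1 to 3 end up as 3 and obs 4 stays missing, after which the loop exits rather than running forever.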


On Wed, Jul 28, 2010 at 6:43 AM, Dana Chandler <dchandler@gmail.com> wrote:

> The command described here works fine when I go one way: replace myvar
> = myvar[_n-1] if myvar >= .
>
> However, when I try to replace in the other direction replace myvar =
> myvar[_n+1] if myvar >= ., it doesn't work and I have to repeat the
> command for each time I want it to copy... Does anyone have any
> suggestions?
>
> I've read the below, but it doesn't quite make sense.
>
> http://www.stata.com/support/faqs/data/missing.html
