Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Replacing missing values only works one way?


From   Dana Chandler <[email protected]>
To   [email protected]
Subject   Re: st: Replacing missing values only works one way?
Date   Tue, 3 Aug 2010 02:12:47 -0500

Thanks everybody --

Nick -- Thanks especially for your observation-by-observation
explanation. That makes it very clear.

I will most likely re-sort (though doing so is slightly problematic
given the nature of the data structure, as I'm doing a lot of complicated
fill-ins). This once I might just keep my loop that forward replaces
multiple times. That said, I'm glad that I better understand the way
Stata loops through observations, which is what causes this occurrence.
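
For the record, here is a minimal sketch of the re-sort route, using Nick's
five-observation example from below. The helper variable obsno is only an
illustrative name of my own, and this assumes the current order is the one
to be restored afterwards:

```stata
* Toy data from Nick's example: y is missing in obs 2, 3 and 4
clear
input y
1
.
.
.
5
end

* obsno is a hypothetical helper: it records the current order
gen long obsno = _n

* Reverse the order, so the value that used to follow now sits at [_n-1]
gsort -obsno

* The usual cascade now runs "backwards" relative to the original order
replace y = y[_n-1] if missing(y)

* Restore the original order and drop the helper
sort obsno
drop obsno
list
```

After this, obs 2 to 4 should all hold 5, the value that originally followed
them. Any missings at the very end of the original order stay missing, since
nothing follows them.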

Best,
Dana


On Wed, Jul 28, 2010 at 11:50 AM, Nick Cox <[email protected]> wrote:
>
> I agree with Eric and Joseph.
>
> I am responsible for the FAQ mentioned which "doesn't quite make sense" (and no doubt for much else that qualifies also).
>
> The key material here is in its sections 3 and 4. The main idea is that -replace- looking forwards is _not_ just a reverse of the same thing backwards because Stata -replace-s according to the current dataset order.
>
> Let me try again with a simple example along Joseph's lines but even more explicit.
>
> y is missing in obs 2, 3, 4:
>
> Obs    y
> 1      1
> 2      .
> 3      .
> 4      .
> 5      5
>
> Consider first
>
> . replace y = y[_n-1] if missing(y)
>
> Now -replace- is applied to each observation in turn.
>
> Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> Obs 2. missing(y) is true. The previous value y[_n-1] namely y[1] namely 1 is used to -replace- the value in this obs.
>
> Obs 3. missing(y) is true. The previous value y[_n-1] namely y[2] namely 1 -- it's just been changed -- is used to -replace- the value in this obs.
>
> Obs 4. missing(y) is true. The previous value y[_n-1] namely y[3] namely 1 -- it's just been changed -- is used to -replace- the value in this obs.
>
> Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> What's happening here I call a cascade. You might want to substitute your own image, say that of a domino effect.
>
> You now have
>
> Obs    y
> 1      1
> 2      1
> 3      1
> 4      1
> 5      5
>
> Now go back to the original and consider
>
> . replace y = y[_n+1] if missing(y)
>
> Note that despite the forward reference to [_n+1] Stata does not reverse the direction of processing.
>
> Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> Obs 2. missing(y) is true. The following value y[_n+1] namely y[3] namely . is used to -replace- the value in this obs. That makes no difference.
>
> Obs 3. missing(y) is true. The following value y[_n+1] namely y[4] namely . is used to -replace- the value in this obs. That makes no difference.
>
> Obs 4. missing(y) is true. The following value y[_n+1] namely y[5] namely 5 is used to -replace- the value in this obs.
>
> Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> You now have
>
> Obs    y
> 1      1
> 2      .
> 3      .
> 4      5
> 5      5
>
> There is no cascade backwards. If you want a cascade of backwards replacements, you have to reverse time, or more prosaically change the -sort- order. There is more in the FAQ.
>
> Nick
> [email protected]
>
> Joseph McDonnell
>
> As Eric says, the commands do work. It's just your expectation doesn't
> match what you get. Eric has probably identified the "problem", namely
> a missing followed by one or more missings. Suppose rows 1 and 2 are
> missing in myvar but row 3 isn't. The replace command tells Stata to
> replace the value in row 1 with the value in row 2. But that's missing
> as well, so row 1 doesn't change. Row 2 will be assigned row 3's value
> and you'll need to do another replace to backwards propagate the value
> to row 1.
>
> You could use "assert" to check for missingness and loop until all
> missings are filled:
>
> . capture assert myvar<.
> . while _rc {
> . replace myvar=myvar[_n+1] if myvar>=.
> . capture assert myvar<.
> . }
>
>
> On Wed, Jul 28, 2010 at 6:43 AM, Dana Chandler <[email protected]> wrote:
>
> > The command described here works fine when I go one way: replace myvar
> > = myvar[_n-1] if myvar >= .
> >
> > However, when I try to replace in the other direction replace myvar =
> > myvar[_n+1] if myvar >= ., it doesn't work and I have to repeat the
> > command for each time I want it to copy... Does anyone have any
> > suggestions?
> >
> > I've read the below, but it doesn't quite make sense.
> >
> > http://www.stata.com/support/faqs/data/missing.html
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


