

From: Dana Chandler <dchandler@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Replacing missing values only works one way?
Date: Tue, 3 Aug 2010 02:12:47 -0500

Thanks, everybody. Nick, thanks especially for your observation-by-observation explanation. That makes it very clear.

I will most likely re-sort, though doing so is slightly problematic given the nature of the data structure (I'm doing a lot of complicated fill-ins). This once I might just keep my loop that forward-replaces multiple times. That said, I'm glad that I better understand the way Stata loops through observations, which is what causes this occurrence.

Best,
Dana

On Wed, Jul 28, 2010 at 11:50 AM, Nick Cox <n.j.cox@durham.ac.uk> wrote:
>
> I agree with Eric and Joseph.
>
> I am responsible for the FAQ mentioned which "doesn't quite make sense" (and no doubt for much else that qualifies also).
>
> The key material here is in its sections 3 and 4. The main idea is that -replace- looking forwards is _not_ just a reverse of the same thing backwards because Stata -replace-s according to the current dataset order.
>
> Let me try again with a simple example along Joseph's lines but even more explicit.
>
> y is missing in obs 2, 3, 4:
>
>     Obs    y
>       1    1
>       2    .
>       3    .
>       4    .
>       5    5
>
> Consider first
>
>     . replace y = y[_n-1] if missing(y)
>
> Now -replace- is applied to each observation in turn.
>
> Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> Obs 2. missing(y) is true. The previous value y[_n-1] namely y[1] namely 1 is used to -replace- the value in this obs.
>
> Obs 3. missing(y) is true. The previous value y[_n-1] namely y[2] namely 1 -- it's just been changed -- is used to -replace- the value in this obs.
>
> Obs 4. missing(y) is true. The previous value y[_n-1] namely y[3] namely 1 -- it's just been changed -- is used to -replace- the value in this obs.
>
> Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> What's happening here I call a cascade. You might want to substitute your own image, say that of a domino effect.
>
> You now have
>
>     Obs    y
>       1    1
>       2    1
>       3    1
>       4    1
>       5    5
>
> Now go back to the original and consider
>
>     . replace y = y[_n+1] if missing(y)
>
> Note that despite the forward reference to [_n+1] Stata does not reverse the direction of processing.
>
> Obs 1. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> Obs 2. missing(y) is true. The following value y[_n+1] namely y[3] namely . is used to -replace- the value in this obs. That makes no difference.
>
> Obs 3. missing(y) is true. The following value y[_n+1] namely y[4] namely . is used to -replace- the value in this obs. That makes no difference.
>
> Obs 4. missing(y) is true. The following value y[_n+1] namely y[5] namely 5 is used to -replace- the value in this obs.
>
> Obs 5. missing(y) is false because y is not missing. No -replace- as the -if- condition is not satisfied.
>
> You now have
>
>     Obs    y
>       1    1
>       2    .
>       3    .
>       4    5
>       5    5
>
> There is no cascade backwards. If you want a cascade of backwards replacements, you have to reverse time, or more prosaically change the -sort- order. There is more in the FAQ.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Joseph McDonnell
>
> As Eric says, the commands do work. It's just your expectation doesn't
> match what you get. Eric has probably identified the "problem", namely
> a missing followed by one or more missings. Suppose rows 1 and 2 are
> missing in myvar but row 3 isn't. The replace command tells Stata to
> replace the value in row 1 with the value in row 2. But that's missing
> as well, so row 1 doesn't change. Row 2 will be assigned row 3's value
> and you'll need to do another replace to backwards propagate the value
> to row 1.
>
> You could use "assert" to check for missingness and loop until all
> missings are filled.
>
>     . capture assert myvar<.
>     . while _rc {
>     .     replace myvar=myvar[_n+1] if myvar>=.
>     .     capture assert myvar<.
>     . }
>
>
> On Wed, Jul 28, 2010 at 6:43 AM, Dana Chandler <dchandler@gmail.com> wrote:
>
> > The command described here works fine when I go one way: replace myvar
> > = myvar[_n-1] if myvar >= .
> >
> > However, when I try to replace in the other direction replace myvar =
> > myvar[_n+1] if myvar >= ., it doesn't work and I have to repeat the
> > command for each time I want it to copy... Does anyone have any
> > suggestions?
> >
> > I've read the below, but it doesn't quite make sense.
> >
> > http://www.stata.com/support/faqs/data/missing.html
> > *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
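
Nick's suggestion to "change the -sort- order" can be sketched roughly as follows, using his five-observation example. This is only an illustration under stated assumptions, not code taken from the thread or the FAQ: the helper variable obsno is a hypothetical name introduced here to record the original order, and the data are typed in with -input- purely so the example is self-contained.

```stata
* Rebuild Nick's five-observation example (y missing in obs 2, 3, 4)
clear
input y
1
.
.
.
5
end

* obsno is a hypothetical helper variable recording the original order
generate long obsno = _n

* Reverse the order, so "previous" now means "next" in the original data
gsort -obsno

* The ordinary cascade now copies each following original value backwards
replace y = y[_n-1] if missing(y)

* Restore the original order and check the result
sort obsno
assert y == 1 in 1
assert y == 5 in 2/5
list obsno y
```

Because the cascade runs in a single pass over the reversed data, no loop is needed; the cost is two sorts. A missing value in the last original observation has nothing after it to copy from, so it would stay missing.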
