Stata The Stata listserver
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: Replace


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: Replace
Date   Fri, 21 Nov 2003 19:56:51 -0000

Adrian de la Garza

> I need to replace some values of a variable that meet certain
> conditions. Although I know it's not possible to make
> direct references
> to rows using the -replace- command, this is what I want to do:
>
> replace status = status[`j'] if status[`j'] == "NM" |
> status[`j'] == "W"
> in `i'

This is perfectly legal, in itself, so long as the local
macro makes sense in this context, i.e. depending on
what local macro j contains. However, it's not a form
that seems needed very often.

> I tried to use a local variable to avoid using status[`j'] in the
> command line but it didn't work. Like this:
>
> 					local sta = status[`j']
> 					replace status = "`sta'" if
> status[`j']=="NM"|status[`j']=="W" in `i'

No, this just puts the _text_ in that position, not the _value_.

> You will find my whole script below.
>
> Do you know how I can get past this problem?
>
> Here I present a chunk of my dataset so that you understand
> the problem
> better. You don't need to read the rest if you already
> understood what I
> want to do from the command line above.
>
> The data is organized in a panel format by country-date.
> What I need is
> to fill in the values of 'status' whenever they are missing and they
> meet certain conditions.
>
> Suppose I start a loop that sweeps through 'status' row by row. Most
> observations in 'status' are missing and suddenly, in
> observation [i] I
> find one cell that is not missing.
> (a) If status[i] = "M"/"W"/"NM", the subsequent rows
> (observations [j],
> where j gets different values) should be "M"/"W"/"NM" whenever
> country[i]=country[j] and date1[i]=date1[j].
>
> (b) If, however, status[i] = "PC" then I need to look at the next
> non-empty cell to decide what "PC" is going to be in [i].
> PC is going to
> be converted into "M" except if a subsequent value is "W" or "NM". A
> value in row [j] is considered subsequent if [j] > [i] and
> country[i]=country[j] and date1[i]=date1[j].
>
> The data looks like this:
>
>                country    date      date1    date2     status
>   401. |       Armenia    1997m5    1995m6   1999m12          |
>   402. |       Armenia    1997m6    1996m2   1999m12       PC |
>   403. |       Armenia    1997m7    1996m2   1999m12          |
>   404. |       Armenia    1997m8    1996m2   1999m12          |
>   405. |       Armenia    1997m9    1996m2   1999m12          |
>        |------------------------------------------------------|
>   406. |       Armenia   1997m10    1996m2   1999m12          |
>   407. |       Armenia   1997m11    1996m2   1999m12          |
>   408. |       Armenia   1997m12    1996m2   1999m12          |
>   409. |       Armenia    1998m1    1996m2   1999m12          |
>   410. |       Armenia    1998m2    1996m2   1999m12        W |
>        |------------------------------------------------------|
>   411. |       Armenia    1998m3    1996m2   1999m12          |
>   412. |       Armenia    1998m4    1996m2   1999m12          |
>   413. |       Armenia    1998m5    1996m2   1999m12          |
>   414. |       Armenia    1998m6    1996m2   1999m12          |
>   415. |       Armenia    1998m7    1996m2   1999m12          |
>        |------------------------------------------------------|
>   416. |       Armenia    1998m8    1996m2   1999m12          |
>   417. |       Armenia    1998m9    1996m2   1999m12          |
>   418. |       Armenia   1998m10    1996m2   1999m12          |
>   419. |       Armenia   1998m11    1996m2   1999m12          |
>   420. |       Armenia   1998m12    1996m2   1999m12       PC |
>        |------------------------------------------------------|
>   421. |       Armenia    1999m1    1996m2   1999m12          |
>   422. |       Armenia    1999m2    1996m2   1999m12          |
>   423. |       Armenia    1999m3    1996m2   1999m12          |
>   424. |       Armenia    1999m4    1996m2   1999m12          |
>   425. |       Armenia    1999m5    1996m2   1999m12          |
>        |------------------------------------------------------|
>   426. |       Armenia    1999m6    1996m2   1999m12          |
>   427. |       Armenia    1999m7    1996m2   1999m12          |
>   428. |       Armenia    1999m8    1996m2   1999m12          |
>   429. |       Armenia    1999m9    1996m2   1999m12          |
>   430. |       Armenia   1999m10    1996m2   1999m12        M |
>        |------------------------------------------------------|
>   431. |       Armenia   1999m11    1996m2   1999m12          |
>   432. |       Armenia   1999m12    1996m2   1999m12          |
>   433. |       Armenia    2000m1         .         .          |
>   434. |       Armenia    2000m2         .         .          |
>
> And after running my script it should look like this:
>
>                country    date      date1    date2     status
>   401. |       Armenia    1997m5    1995m6   1999m12          |
>   402. |       Armenia    1997m6    1996m2   1999m12        W |
>   403. |       Armenia    1997m7    1996m2   1999m12        W |
>   404. |       Armenia    1997m8    1996m2   1999m12        W |
>   405. |       Armenia    1997m9    1996m2   1999m12        W |
>        |------------------------------------------------------|
>   406. |       Armenia   1997m10    1996m2   1999m12        W |
>   407. |       Armenia   1997m11    1996m2   1999m12        W |
>   408. |       Armenia   1997m12    1996m2   1999m12        W |
>   409. |       Armenia    1998m1    1996m2   1999m12        W |
>   410. |       Armenia    1998m2    1996m2   1999m12        W |
>        |------------------------------------------------------|
>   411. |       Armenia    1998m3    1996m2   1999m12        W |
>   412. |       Armenia    1998m4    1996m2   1999m12        W |
>   413. |       Armenia    1998m5    1996m2   1999m12        W |
>   414. |       Armenia    1998m6    1996m2   1999m12        W |
>   415. |       Armenia    1998m7    1996m2   1999m12        W |
>        |------------------------------------------------------|
>   416. |       Armenia    1998m8    1996m2   1999m12        W |
>   417. |       Armenia    1998m9    1996m2   1999m12        W |
>   418. |       Armenia   1998m10    1996m2   1999m12        W |
>   419. |       Armenia   1998m11    1996m2   1999m12        W |
>   420. |       Armenia   1998m12    1996m2   1999m12        M |
>        |------------------------------------------------------|
>   421. |       Armenia    1999m1    1996m2   1999m12        M |
>   422. |       Armenia    1999m2    1996m2   1999m12        M |
>   423. |       Armenia    1999m3    1996m2   1999m12        M |
>   424. |       Armenia    1999m4    1996m2   1999m12        M |
>   425. |       Armenia    1999m5    1996m2   1999m12        M |
>        |------------------------------------------------------|
>   426. |       Armenia    1999m6    1996m2   1999m12        M |
>   427. |       Armenia    1999m7    1996m2   1999m12        M |
>   428. |       Armenia    1999m8    1996m2   1999m12        M |
>   429. |       Armenia    1999m9    1996m2   1999m12        M |
>   430. |       Armenia   1999m10    1996m2   1999m12        M |
>        |------------------------------------------------------|
>   431. |       Armenia   1999m11    1996m2   1999m12        M |
>   432. |       Armenia   1999m12    1996m2   1999m12        M |
>   433. |       Armenia    2000m1         .         .          |
>   434. |       Armenia    2000m2         .         .          |
>
> This is my script and I don't know what the problem is:
>
>       local m = _N
> 	forv i = 1/`m' {
> 		if status[`i'] != "" {
> 			if status[`i'] == "PC" {
> 				local j = `i'+1
> 				while status[`j'] == "" &
> country[`j']==country[`i'] & date1[`i']==date1[`j'] {
> 					local sta = status[`j']
> 					replace status = "`sta'" if
> status[`j']=="NM"|status[`j']=="W" in `i'
> 					replace status = "M" if
> status[`j']=="M"|(date1[`j']!=date1[`i'] &
> country[`j']!=country[`i'] in
> `i'
> 				}
> 			}
> 			else {
> 				local j = `i'+1
> 				local k = 0
> 				while status[`j'] == "" &
> country[`i']==country[`j'] & date1[`i']==date1[`j'] {
> 					local sta = status[`i']
> 					replace status =
> "`sta'" in `j'
> 					local j = `j'+1
> 					local k = `k'+1
> 				}
> 			}
> 			local i = `i'+`k'
> 		}
> 	}

The spirit is willing, but the flesh is weak,
especially Friday evening (local time),
so I am not going to try and work out what's wrong
with this code. But broadly, I'll assert that
loops are only rarely necessary in data management.

There is guidance on similar problems in the
FAQ at
How can I replace missing values with previous
or following nonmissing values?
http://www.stata.com/support/faqs/data/missing.html

I'll probably misunderstand some details here,
but something like this should help.

Case (a) is covered by this, I think:

bysort country date1 (date) : replace status = status[_n-1]
	if mi(status) & status[_n-1] != "PC"

Case (b) is worse, but tractable. I think you need some preparation

gen negdate = -date
gen next = status
bysort country date1 (negdate) :
	replace next = status[_n-1] if mi(next)

bysort country date1 (date) :
	replace status = next
	if status == "PC" & (next == "W" | next == "NM")

by country date1 :
	replace status = "M" if status == "PC"

by country date1 : replace status = status[_n-1] if mi(status)

Note some features here:

* The constraint that things are done for the same
-country- and the same value of -date1- are enforced
within the framework of -by:-.

* Reversing time is the natural way of looking ahead,
given that Stata is better at looking behind, because
it knows what it has just done, but what it is going
to do next.

Again, do consider the FAQ.

Nick
[email protected]

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2024 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index