
# st: RE: RE: RE: RE: RE: looping to value of a variable

 From Nick Cox To "'statalist@hsphsun2.harvard.edu'" Subject st: RE: RE: RE: RE: RE: looping to value of a variable Date Thu, 23 Feb 2012 17:58:17 +0000

```
I see, I think.

gen flag = 0

forval j = 1/8 {
replace flag = 1 if missing(DFU`j') & flag == 0 & `j' <= maxFU
}

Nick
n.j.cox@durham.ac.uk

Richard Fox

Sorry for the confusion.

I want just one flag that tells me if each record (row) has a missing value for the DFU variables. This would be simple were it not for the fact that for certain rows I only want to assess a subset of the variables for missing values. As per the example data I only want to assess DFU1-DFU(maxFU) for missingness.

If I could use the value of maxFU as above DFU1-DFU(maxFU) then I could simply use

egen flag = rowmiss(DFU1-DFU(maxFU))

but I don't believe that's possible.

If I use egen flag = rowmiss(DFU1-DFU8) then for the 1st row I'd get 6, whereas I want just 1. For id 3 I'd expect flag == 0.

If I don't loop over records, I believe Stata will set the flag to 1 in all rows as soon as it finds any missing value.

After further thought, this could be done with a simple formula. Nonetheless, I'm still interested to see how to loop to a variable value. I see that Mata may be a solution and will explore this in more detail. This is easily done in SAS, but I appreciate that Stata thinks in the opposite direction.

Not sure if it helps, but I'm cleaning data for an oncology study. So for id (patient) 1 there should be 3 follow-up (FU) forms, each having a date of completion, DFU (date of follow-up).

id  DFU1        DFU2        DFU3        DFU4        DFU5        DFU6        DFU7        DFU8        maxFU
1   30/10/1910  .           08/02/1904  .           .           .           .           .           3
2   16/12/1908  24/01/1913  .           08/02/1904  .           .           .           .           4
3   04/09/1907  13/10/1911  21/11/1915  30/12/1919  07/02/1924  17/03/1928  25/04/1932  .           7
4   18/10/1914  .           08/02/1904  18/03/1908  26/04/1912  04/06/1916  13/07/1920  21/08/1924  8

I managed to get my code working; perhaps this illustrates what I'm trying to do:

/* identify rows with missing dates */
gen flag=0
count
local N=r(N)
forvalues i = 1/`N' {
    /* sp holds the max number of follow-up visits for the particular patient (row) */
    local sp = maxFU[`i']
    forvalues j = 1/`sp' {
        replace flag=1 if DFU`j'==. & _n==`i'
    }
}

Nick Cox

Sorry, but I am still unclear on what flags you want.

The fact that -maxFU- exists seems to be a red herring. You can create flags by

forval j = 1/8 {
gen ismissing`j' = missing(DFU`j')
}

Or, if you want it the other way round, negate the function call with -!missing()-

But why do you need the flags at all?

Even if I am misunderstanding you, which is quite likely, the small bit of Stata technique may be some help.

Nick
n.j.cox@durham.ac.uk

Richard Fox

Hi Nick,

Yes you're correct, sorry for the confusion over DFU and FU. I added the egen function to illustrate where the loop count values could come from. In fact the values came from reshaping long data.

I want to flag missing dates; however, for each record I need to assess only up to a certain point. These are missing follow-up forms in a medical scenario: if patients are only followed for a certain time, then I can't record a form as missing if the patient has not reached that time-point.

Take the example below; for the 1st id I only want to loop to 3 to test for missing values. In the second id I only want to loop to 4, and so on. I suppose I could just only increment a counter if `i' <= maxFU. Just to note that the code within the loops (replace flag.....) was incomplete in my previous message - it was really just the form of the loop statements that I was interested in.

id  dfu1        dfu2        dfu3        dfu4        dfu5        dfu6        dfu7        dfu8        maxFU
1   30/10/1910  .           08/02/1904  .           .           .           .           .           3
2   16/12/1908  24/01/1913  .           08/02/1904  .           .           .           .           4
3   04/09/1907  13/10/1911  21/11/1915  30/12/1919  07/02/1924  17/03/1928  25/04/1932  .           7
4   18/10/1914  .           08/02/1904  18/03/1908  26/04/1912  04/06/1916  13/07/1920  21/08/1924  8

I'll have a look at the reference.

Nick Cox

Your example is not very clear. You have FU* and by implication DFU*. Do you want to flag missings or non-missings? I can read your post either way.

However, you (almost surely) do not need to loop over observations. It is sufficient to loop over variables.

See a review in this territory

SJ-9-1  pr0046  . . . . . . . . . . . . . . . . . . .  Speaking Stata: Rowwise
(help rowsort, rowranks if installed) . . . . . . . . . . .  N. J. Cox
Q1/09   SJ 9(1):137--157
shows how to exploit functions, egen functions, and Mata
for working rowwise; rowsort and rowranks are introduced

Nick
n.j.cox@durham.ac.uk

Richard Fox
I want to loop to the value of a variable. Let's say I have generated the number of non-missing values in a row of data (maxFU in example below). I want to loop to that value which clearly can differ between records.

The following does the job but feels like cheating.

egen maxFU = rownonmiss(FU1 FU2 FU3 FU4 FU5)

count
local N=r(N)
forvalues i = 1/`N' {
local sp = maxFU[`i']
forvalues j=1/`sp'	{
qui replace flag`j'=1 if DFU`j'==.
}
}

There must be a simpler way; any ideas?

```
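
The idea of looping over variables rather than observations can be tried on a small dataset. What follows is a minimal sketch, not code from the thread. It assumes a hypothetical four-variable miniature of the example data (the thread has eight DFU variables), with dates held as strings so that an empty string stands in for a missing date. The names DFU1-DFU4, maxFU, and flag follow the thread; the data rows are illustrative only.

```stata
* Hypothetical miniature of the thread's data (assumption: 4 follow-ups,
* dates stored as strings, "" meaning the form's date is missing)
clear
input byte id str10 DFU1 str10 DFU2 str10 DFU3 str10 DFU4 byte maxFU
1 "30/10/1910" ""           "08/02/1904" ""           3
2 "16/12/1908" "24/01/1913" ""           "08/02/1904" 4
3 "04/09/1907" "13/10/1911" "21/11/1915" ""           3
4 "18/10/1914" "26/04/1912" ""           ""           2
end

* One flag per row: 1 if any expected date (position <= maxFU) is missing
gen byte flag = 0
forvalues j = 1/4 {
    * the comparison with maxFU is evaluated separately in every row,
    * so positions beyond a row's maxFU never set the flag
    replace flag = 1 if missing(DFU`j') & `j' <= maxFU
}

list id maxFU flag

* ids 1 and 2 have a gap inside their expected range; ids 3 and 4 do not
assert flag == 1 if inlist(id, 1, 2)
assert flag == 0 if inlist(id, 3, 4)
```

Because the condition is checked row by row on each pass, the loop runs once per variable, however many observations the dataset has. With numeric date variables, as in the thread, the same loop should apply unchanged, since missing() accepts both numeric and string arguments.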