Stata The Stata listserver

RE: st: RE: forvalues within foreach?


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: forvalues within foreach?
Date   Tue, 11 May 2004 21:17:24 +0100

Overlapping records are clearly a problem 
for this code. It just adds up blindly. 

Let's copy our end dates to -svcdate_end2- 

. gen svcdate_end2 = svcdate_end 

Now we sort 

. sort enrolid svcdate svcdate_end2 

Our problems arise if the previous 
-svcdate_end2- is >= the current 
-svcdate-. Or, reversing this, 
if the next -svcdate- is <= 
the present -svcdate_end2-. 
That's what overlap is. 

However, we can get confused 
by intervals wholly within 
previous intervals, so 

. by enrolid: drop if svcdate >= svcdate[_n-1] & svcdate_end2 <= svcdate_end2[_n-1] 

Then we go 

. by enrolid: replace svcdate_end2 = svcdate[_n+1] - 1 if svcdate_end2 >= svcdate[_n+1] 

and then work with the modified 
copy. 
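
To make that concrete, here is a minimal sketch that puts the steps 
together on made-up data. The toy records are hypothetical; dates are 
typed as day numbers (14610 is 1 Jan 2000), and the last two commands 
simply reuse the day count from the earlier post with -svcdate_end2- 
in place of -svcdate_end-. 

```stata
* Hypothetical toy data: dates as Stata day numbers (14610 = 1 Jan 2000)
clear
input enrolid svcdate svcdate_end
1 14610 14619
1 14614 14624
1 14616 14617
1 14629 14634
2 14593 14614
end

gen svcdate_end2 = svcdate_end
sort enrolid svcdate svcdate_end2

* drop records lying wholly within the previous record
by enrolid: drop if svcdate >= svcdate[_n-1] & svcdate_end2 <= svcdate_end2[_n-1]

* cut short any record that runs into the next one
by enrolid: replace svcdate_end2 = svcdate[_n+1] - 1 if svcdate_end2 >= svcdate[_n+1]

* days within 1 Jan 2000 to 31 Dec 2001, counting both end days
gen cont = 1 + min(svcdate_end2, mdy(12,31,2001)) - max(mdy(1,1,2000), svcdate)
egen sumservice = sum(cont), by(enrolid)

* person 1 is covered 1-15 Jan and 20-25 Jan; person 2 is clipped to 1-5 Jan
assert sumservice == 21 if enrolid == 1
assert sumservice == 5 if enrolid == 2
list
```

Records lying entirely outside the two-year window would get a 
negative -cont- here, so they would need dropping or setting to 0 first. 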

Note: I am not sure this copes
with all possible quirks. 

More important note: Somebody must 
have solved this problem before!

Assertion: This should be soluble 
without loops. 
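
One possible loop-free sketch, offered under assumptions rather than 
as a tested solution: carry forward the running maximum of the end 
dates within each person, and count for each record only the days 
after that maximum. This is a different technique from the 
drop-and-truncate steps above. It is meant to cope with a short record 
nested inside a long record that is not its immediate predecessor. 
The toy data and the names -maxend-, -newstart-, -cont- and 
-sumservice- are hypothetical, and no end date is assumed missing. 

```stata
* Hypothetical toy data: dates as Stata day numbers (14610 = 1 Jan 2000)
clear
input enrolid svcdate svcdate_end
1 14610 14709
1 14611 14612
1 14613 14614
2 14610 14619
2 14614 14624
2 14629 14634
end

sort enrolid svcdate svcdate_end

* running maximum of end dates so far, within person
* (replace works through observations in order, so maxend[_n-1] is already updated)
by enrolid: gen long maxend = svcdate_end if _n == 1
by enrolid: replace maxend = max(maxend[_n-1], svcdate_end) if _n > 1

* first day of each record not already covered by earlier records
* (max() ignores the missing value that arises for the first record)
by enrolid: gen long newstart = max(svcdate, maxend[_n-1] + 1)

* newly covered days inside 1 Jan 2000 to 31 Dec 2001, or 0 if there are none
gen long cont = max(0, 1 + min(svcdate_end, mdy(12,31,2001)) - max(newstart, mdy(1,1,2000)))

* total per person, copied to every record
by enrolid: gen long sumservice = sum(cont)
by enrolid: replace sumservice = sumservice[_N]

* person 1: one 100-day record with two short records nested inside it
assert sumservice == 100 if enrolid == 1
* person 2: 1-15 Jan plus 20-25 Jan
assert sumservice == 21 if enrolid == 2
```

No records are dropped and no end dates are altered, so the original 
data stay intact for checking. 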

Nick 

[email protected] 

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]On Behalf Of
> [email protected]
> Sent: 11 May 2004 20:38
> To: [email protected]
> Subject: Re: st: RE: forvalues within foreach?
> 
> 
> Thank you Nick.  The dataset I am working with, however, 
> contains non-disjoint records.  Can the approach you provided be 
> modified to address overlapping service dates?  When records do 
> overlap, I do not want to double-count.  
> --Clint Thompson 
> 
> 
> 
> On 11 May 2004 at 20:23, Nick Cox wrote:
> 
> > In program 2 you have several problems. 
> > 
> > The local macro 0 contains what you type 
> > after the program name, in your case 
> > a (probably unexpanded) varlist. Although 
> > the -syntax- statement will expand it, 
> > that doesn't affect `0'. So the first step is 
> > to go to 
> > 
> > program var_rep
> >  version 8
> >  syntax varlist(numeric)
> >  local n 10
> >  foreach var of local varlist {
> >    forvalues i = 14610(1)`n' {
> >     replace `var' = 1 if (`i' >= svcdate & `i' <= svcdate_end),
> >     by(enrolid)
> >    }
> >  }
> > end
> > 
> > But that still leaves two bugs that I can see: 
> > 
> > 1. 14610(1)10 won't go anywhere. You mean 14610/14619. 
> > 
> > 2. -replace- doesn't take a -by()- option. 
> > 
> > However, given your problem, a direct attack 
> > is possible, I believe, without any loops whatsoever or 
> > indeed any programs whatsoever. 
> > 
> > Your structure appears to be 
> > 
> > enrolid svcdate svcdate_end 
> > 
> > First check that the dates are 
> > the right way round in every case 
> > 
> > . assert svcdate <= svcdate_end 
> > 
> > Possibly you even have several 
> > records for each person. That's no 
> > problem, so long as they are disjoint. 
> > 
> > For each person, you want #days in 
> > service between 1 Jan 2000 and 
> > 31 Dec 2001. The length of relevant service is 
> > 
> > min(svcdate_end, mdy(12,31,2001))  
> > -max(mdy(1,1,2000),svcdate) 
> > 
> > So I think what you want is 
> > 
> > gen cont = 1 + min(svcdate_end, mdy(12,31,2001)) - max(mdy(1,1,2000),svcdate) 
> > egen sumservice = sum(cont), by(enrolid) 
> > 
> > Note the 1, based on the assumption that anyone who 
> > arrived and left the same day is regarded as serving 
> > 1 day, etc. Delete according to taste. 
> > 
> > Nick 
> > [email protected] 
> > 
> > [email protected]
> > 
> > > I have two small (and clumsy) programs wherein the objective is 
> > > to create a variable for each day over a two year time frame 
> > > (01Jan2000 - 31Dec2001) then assign the value 1 if the subject 
> > > was on service, as defined by two variables:  svcdate & 
> > > svcdate_end.  My programs are pasted below; the first one 
> > > (var_gen) generates the variables as expected (note that I 
> > > limited variable generation to just the first 10 days in 
> > > 2000).  The second program, however, executes when run but it 
> > > does not return a 1 where it should.  I suspect that the problem 
> > > may be with the forvalues loop in the foreach statement.  Any 
> > > advice or suggestions?  My ultimate objective is to sum the total 
> > > number of days each subject was on service over the two year 
> > > period.   Thank you.  Clint Thompson 
> > > 
> > > Program #1:
> > > program var_gen
> > > version 8
> > > local N 10
> > > forvalues i = 1(1)`N' {
> > > 	gen day`i' = 0
> > > 	}
> > > end
> > > 
> > > 
> > > Program #2:
> > > program var_rep
> > > version 8
> > > syntax varlist(numeric)
> > > local n 10
> > > foreach var of local 0 {
> > > 	forvalues i = 14610(1)`n' {
> > > 	replace `var' = 1 if (`i' >= svcdate & `i' <= 
> > > svcdate_end), by(enrolid)
> > > 	}
> > > }
> > > end
> > 
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/support/faqs/res/findit.html
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> 
