Statalist


RE: st: AW: RE: Problem looping over spells for an individual


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: AW: RE: Problem looping over spells for an individual
Date   Thu, 26 Feb 2009 12:20:22 -0000

Thanks for this, which is good news for me because it explains why the code I was seeing looked as it did. 

In terms of moving forward, I have a few vague suggestions. 

0. Spells. See the suggestions on reading and software in the thread started by Jakob Petersen yesterday. 

<http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.0902/date/article-1122.html> 

1. The first is more a matter of style or taste than technique. I prefer to tag the observations I want to keep or work on with 1 and those I don't with 0. Then you can do almost anything later 

	... if tag 

or 

	... if !tag 

as the case may be. 

An advantage of that style: it is reversible, both within an algorithm and generally. 
(If you really want to -drop- observations, -drop- them in one go when the selection is final.) 
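
A minimal sketch of that tagging style, using the four example spells posted in this thread. The variable name tag and the rule used to set it (spells that end in a case) are illustrative assumptions only.

```stata
* Toy data copied from the example in this thread
clear
input id start end case tx
1 10 20 1 1
1 20 35 1 0
1 35 50 1 0
1 50 100 . .
end

* tag is 1 for observations to work on, 0 otherwise
* (the rule here, "spell ends in a case", is only for illustration)
gen byte tag = case == 1

list if tag          // the selected spells
list if !tag         // everything else, still in memory

* only when the selection is final:
* drop if !tag
```

Because nothing is dropped, the rule can be changed and tag regenerated without reloading the data.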

2. One strategy might be 

loop over individuals { 
	-expand- each individual to a block of observations with one observation per day 
	<magic bit> 
	reduce each individual back again 
} 
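
What follows is a rough sketch of the expand-and-reduce mechanics on the toy data from this thread, not a full solution. The assumptions are mine: a spell is treated as the days start+1 to end, lag is set to 20 because the posted code uses end + lag and the posted target moves a start from day 20 to day 40, and the "magic bit" here applies only the unconditional treatment rule. The conditional rule for cases is left out. The names spell, ndays, day, lasttx, atrisk and firstday are hypothetical.

```stata
clear
input id start end case tx
1 10 20 1 1
1 20 35 1 0
1 35 50 1 0
1 50 100 . .
end
local lag 20    // assumption: end + lag gives the new start (20 -> 40)

* expand: one observation per day, days start+1, ..., end of each spell
sort id start
gen long spell = _n
gen long ndays = end - start
expand ndays
bysort spell: gen long day = start + _n

* magic bit (treatment rule only): carry forward the most recent earlier
* day on which a spell ended in treatment; replace works down the
* observations in order, so lasttx[_n-1] is already up to date
bysort id (day): gen long lasttx = .
by id: replace lasttx = cond(tx[_n-1] == 1 & day[_n-1] == end[_n-1], day[_n-1], lasttx[_n-1])
gen byte atrisk = missing(lasttx) | day - lasttx > `lag'

* reduce: back to one observation per spell that still has time at risk
keep if atrisk
collapse (min) firstday = day (max) end = day (first) case tx, by(id spell)
gen start = firstday - 1
list id start end case tx, sepby(id)
```

With only a handful of individuals no explicit loop is needed, as -by:- does the work; for large datasets the expanded file can get big, which is where looping over individuals would come in.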

3. This problem reminds me loosely of one tackled with -panelthin- on SSC. The code for that may suggest some technique. 

Nick 
[email protected] 

Ilona Carneiro

Many thanks to Nick & Martin for pointing out my error using "if" -  
you are correct and that's why it wasn't working. However, I'm still  
unable to do what I wanted to. Apologies for posting code which I  
tried to simplify, but just made incomprehensible! The snippet was  
part of a much larger programme in which the other local macros are  
all defined.

I'll try to clarify. Here is an example of the problem I have. These  
are consecutive periods of observations for an individual - the end  
denoted by a clinic visit which may or may not be defined as a case  
(depending on diagnostic result), or by exit from the study.

id    start    end    case    tx
1     10       20     1       1
1     20       35     1       0
1     35       50     1       0
1     50       100    .       .

I need to exclude 19 days at risk if the patient received treatment
(tx==1), as this is considered to be prophylaxis. To avoid counting
the same episode (case==1) twice, I also exclude 19 days at risk after
a case is diagnosed. However, as the latter is only to prevent
double-counting, it is not necessary if the case has already been disqualified.

What I need to get is the following:

id    start    end    case    tx
1     10       20     1       1
1     40       50     1       0
1     50       100    .       .

I originally coded the following VERY crudely:

/* To calculate the gaps  */
sort id start
by id: gen lagend = end + lag if (tx > 0 & tx < .) | (case > 0 & case < .) & _n!=_N


/* To drop periods of time that are disqualified - repeated 3 times as  
there may be up to 3 consecutively - to be generalisable, it could be  
more */
sort id start
by id: drop if lagend[_n-1] > end & lagend[_n-1] < . & _n!=1
sort id start
by id: drop if lagend[_n-1] > end & lagend[_n-1] < . & _n!=1
sort id start
by id: drop if lagend[_n-1] > end & lagend[_n-1] < . & _n!=1
sort id start
by id: drop if lagstart > end & lagstart < . & _n!=1

/* To update the start date */
sort id start
by id: replace start = lagend[_n-1] if lagend[_n-1] < . & _n!=1	
sort id start
by id: drop if (end < start | start[_n-1] > end) & end < . & start < . & _n!=1

This works fine for adding a gap after each treatment, as I need to do  
this even if the observation period is dropped from the time at risk.  
The code gave the following result, as both the 2nd & 3rd episodes  
were disqualified, instead of just the 2nd:

id    start    end    case    tx
1     10       20     1       1
1     55       100    .       .

I realise that I need to evaluate the generation of the gap after
cases separately for each observation period, in case the observation
is dropped, but I can't seem to find a way to do this. I hope this is a
clearer explanation of the problem.
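
One possible way to evaluate each period in turn, sketched under assumptions rather than offered as a definitive answer: -replace- works down the observations in order, so a running "excluded until" value can be carried from spell to spell with [_n-1]. The names excl, newstart and tag are hypothetical, lag is assumed to be 20 (from end + lag giving 40 after day 20), and a spell is taken to be disqualified when it ends on or before the running value.

```stata
clear
input id start end case tx
1 10 20 1 1
1 20 35 1 0
1 35 50 1 0
1 50 100 . .
end
local lag 20                 // assumption, see above

sort id start
gen double excl = .          // running end of the current exclusion window

* treatment always opens a window; a case does so only if its spell is not
* already wholly excluded (/// continues the line, so run this from a do-file)
by id: replace excl = cond(tx == 1 | (case == 1 & (end > excl[_n-1] | missing(excl[_n-1]))), ///
    max(excl[_n-1], end + `lag'), excl[_n-1])

* max() ignores missing values, so the first spell keeps its own start
by id: gen newstart = max(start, excl[_n-1])
gen byte tag = newstart < end      // 1 if the spell keeps some time at risk
list id start end newstart case tx tag, sepby(id)
```

On the example this tags the 2nd spell with 0 and moves the 3rd start to 40. Under the rule as coded, the qualified case on day 50 opens its own window, so the last spell would start at 70 rather than the 50 shown in the target table; if that is not what is wanted, the condition for cases needs adjusting.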

On another point, I subsequently use stgen gap =  gaplen() to  
calculate how much time to exclude from the time at risk. Stata  
appears to count one more than just the actual gap, i.e. it will give  
me a gap of 20 days between an observation ending with day 20, and a  
subsequent observation starting at day 40, when the actual time  
excluded in-between is 19 days. I'm just subtracting 1 from the  
calculation at present, but is there a reason for this?
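
A small worked check of that arithmetic, assuming (and this is worth confirming in the -stgen- help) that gaplen() reports the next start time minus the previous end time in analysis-time units. The scalar names are hypothetical.

```stata
scalar prev_end   = 20
scalar next_start = 40

display next_start - prev_end       // 20: length of the interval from time 20 to time 40
display next_start - prev_end - 1   // 19: whole days strictly in between (21, ..., 39)
```

On that reading the two numbers answer different questions: the length of the gap as an interval versus the count of whole days falling strictly inside it.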

Ilona



On 25 Feb 2009, at 18:27, Martin Weiss wrote:

>
> I was desperate to find an SJ tip for Ilona on the difference between the
> "if" command and the "if" qualifier; turns out it is an FAQ:
> http://www.stata.com/support/faqs/lang/ifqualifier.html
>
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Nick Cox
> Sent: Wednesday, 25 February 2009 18:22
> To: [email protected]
> Subject: st: RE: Problem looping over spells for an individual
>
> Unless you are working under the aegis of -by:- _N will always be
> interpreted as the total number of observations. This code doesn't
> satisfy that.
>
> I echo Martin Weiss in suspecting that your -if `touse'- is a bug. You
> are almost certainly confusing the two flavours of -if-.
>
> Otherwise, your code still looks very confused and based on a variety of
> misunderstandings. Apart from `touse', which is defined by -marksample-,
> all of the local macros you refer to will be treated as empty strings,
> as none has been defined earlier in the program. I am surprised to hear
> that it is running at all.
>
> It does not look as if you need a program anyway. My impression is that
> all you need is to use -by:- but I don't understand your problem well
> enough to suggest better code. Someone else may be able to give better
> help. If not, rather than a lengthy word description, you should perhaps
> give an example of your data with the intended result.
>
> Nick
> [email protected]
>
> Ilona Carneiro
>
> I am trying to write a programme that will run a command sequentially
> for observations of an individual. For each individual I have multiple
> spells and multiple failures. However, the twist is that I also need
> to exclude a period of time at risk after each treatment (prophylaxis)
> and after each failure (to prevent double-counting of failures that
> may actually be the same episode). I managed to do this without any
> problem for the treatment, but if an episode is disqualified (by a
> prior treatment or episode) I don't want it to disqualify a subsequent
> episode. Therefore I need to run the code sequentially for each spell
> of an individual, but using the marksample touse code to run it "by"
> individual doesn't seem to be working - the "forvalues" seems to
> always interpret _N as the last observation in the whole dataset, not
> the last observation for each individual.
>
> I have the following code:
>
> 		program define byid, byable(recall, noheader)
> 		marksample touse
> 		sort `id' `start'
> 		if `touse' {
> 		forvalues i = 1(1)`=_N' {
> 		replace lagend = (`end' + `lag') if ((`tx' > 0 & `tx' < .) | (`case' > 0 & `case' < .))
> 		drop if lagend[`i'-1]>`end' & `id'[`i'-1]==`id'
> 		}
> 		}
> 		end
> 		
> 		gen lagend=. 	
> 		qui by id: byid
>
> but I get the error:
> 2nd by group not found
> r(111);
>
> And the programme isn't doing what I need it to.
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/




