st: RE: Compute a summary variable based on a predefined algorithm

From	"Nick Cox" <[email protected]>
To	<[email protected]>
Subject	st: RE: Compute a summary variable based on a predefined algorithm
Date	Sun, 18 Mar 2007 17:25:03 -0000

This problem is easier than you think in that no use of looping
(-foreach- etc.) is needed. It is difficult in that there are 
different possible reactions to missings on -v1-. This
post indicates one kind of solution. 

You have panel data. You could -tsset- it without loss: 

tsset childid day

That means that you could then use -tsspell- from SSC. 
Alternatively, you can work from first principles. 
I show the latter, but you might want to look at -tsspell- too. 
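
For comparison, here is a rough sketch of the -tsspell- route on two of the children from the example data. It assumes -tsspell- has been installed from SSC and that its default output variables are named _spell, _seq and _end; the name n_runs is made up for illustration. Note that this counts every run of 1s and does not apply any three-day rule.

```stata
* ssc install tsspell          // uncomment if -tsspell- is not yet installed
clear
input childid day v1
3 1 1
3 2 1
3 3 0
3 4 0
3 5 0
3 6 1
3 7 1
5 1 0
5 2 1
5 3 0
5 4 1
end

tsset childid day

* a spell is a run of consecutive days on which v1 is 1;
* _spell numbers the spells within each child and is 0 outside spells
tsspell, cond(v1 == 1)

* n_runs (hypothetical name): number of runs of 1s per child
egen n_runs = max(_spell), by(childid)

list, sepby(childid)
```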

On one definition, each episode of diarrhea (in English, 
diarrhoea) starts when v1 is 1 and the preceding value is not 1: 

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1 

-first- is an indicator variable. You can use it to define
episodes: 

by childid : gen episodes = sum(first) 

_or_

by childid : gen episodes = cond(v1 == 0, 0, sum(first)) 
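
To see how the two variants differ, here is a small sketch using the values for child 3 from the example data. The names episodes_a and episodes_b are used only so that both versions can sit side by side.

```stata
clear
input childid day v1
3 1 1
3 2 1
3 3 0
3 4 0
3 5 0
3 6 1
3 7 1
end

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1

* variant 1: running count, carried through the days without diarrhea
by childid: gen episodes_a = sum(first)

* variant 2: same count, but 0 on days when v1 is 0
* (days with missing v1 still get the running count)
by childid: gen episodes_b = cond(v1 == 0, 0, sum(first))

* episodes_a is 1 on days 1-5 and 2 on days 6-7;
* episodes_b is 0 on days 3-5 and otherwise the same
list, sepby(childid)
```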

You can record the start dates of each episode: 

by childid : gen start = day if first 
by childid : replace start = start[_n-1] if !first 

The time since the previous start is then 

by childid : gen time_since = start - start[_n-1] if first 

and you are then interested in counting how many episodes 
are not within three days of the previous: 

by childid : egen n_episodes = total(first * (time_since >= 3)) 

The first episode is always included on this definition. 
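
As a check, the steps above can be run on the example data from the original question. This is only a sketch. The last step, which sets the count to missing for a child with no recorded -v1- at all (note 2 of the question), is an addition that is not part of the steps above, and n_obs is a made-up name.

```stata
clear
input childid day v1 v2
1 1 . .
1 2 . .
1 3 . .
1 4 . .
2 1 0 1
2 2 1 1
2 3 1 1
2 4 0 1
3 1 1 2
3 2 1 2
3 3 0 2
3 4 0 2
3 5 0 2
3 6 1 2
3 7 1 2
4 1 1 1
4 2 . 1
4 3 1 1
4 4 0 1
4 5 . 1
4 6 0 1
4 7 0 1
5 1 0 1
5 2 1 1
5 3 0 1
5 4 1 1
6 1 0 0
6 2 0 0
6 3 0 0
6 4 0 0
6 5 0 0
6 6 0 0
6 7 0 0
end

* start of each run of 1s
bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1

* start day of the current or most recent episode, carried forward
by childid: gen start = day if first
by childid: replace start = start[_n-1] if !first

* days since the previous start; missing for a child's first episode
by childid: gen time_since = start - start[_n-1] if first

* missing counts as >= 3 in Stata, so the first episode is included
by childid: egen n_episodes = total(first * (time_since >= 3))

* assumption beyond the steps above: no recorded v1 at all -> missing, not 0
by childid: egen n_obs = count(v1)
replace n_episodes = . if n_obs == 0

* compare with the hand-coded v2 from the question
assert n_episodes == v2
list childid day v1 v2 n_episodes, sepby(childid)
```

Bear in mind that time_since measures start to start, so on other data a long episode followed by a short break could still be counted as two episodes.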

Nick 
[email protected] 

Shuaib Kauchali
 
> I have a data set of birth cohort data with longitudinal follow-up of
> these children until they were 9 months old (270 days), unless they
> were lost to follow-up or died before then.
> 
> the data structure looks like this:
> Childid (repeated group variable, daily visit to the clinic)
> day (day of visit)
> v1 (diarrhea on that day of visit)
> v2 <-- this is the variable I would like to get (defined as diarrhea
> episodes: a string of 1's separated by at least 3 consecutive 0's is
> an episode)
> 
> 
> childid day v1  v2  
> 1   1   .   .       
> 1   2   .   .           
> 1   3   .   .   
> 1   4   .   .   
> 2   1   0   1
> 2   2   1   1
> 2   3   1   1
> 2   4   0   1
> 3   1   1   2
> 3   2   1   2
> 3   3   0   2
> 3   4   0   2
> 3   5   0   2
> 3   6   1   2
> 3   7   1   2
> 4   1   1   1
> 4   2   .   1
> 4   3   1   1
> 4   4   0   1
> 4   5   .   1
> 4   6   0   1
> 4   7   0   1
> 5   1   0   1
> 5   2   1   1
> 5   3   0   1
> 5   4   1   1
> 6   1   0   0
> 6   2   0   0
> 6   3   0   0
> 6   4   0   0
> 6   5   0   0
> 6   6   0   0
> 6   7   0   0
> 
> 
> Note:
> 1. childid=4 is a bit tricky because of missing values; we assume the
> episode to be one as there were not more than 3 days separating 2 events.
> 2. childid=1 has not had any visits recorded, so he gets missing values
> for v2.
> 3. not everyone is followed up for the same period: loss to follow-up,
> death, or completed the study (in my data set this should happen when
> the child reaches 270 days from birth. This is a birth cohort of
> 2500 children)
> 
> My problem is that I am unable to manipulate the data in Stata to get
> the summary v2 of the number of episodes of diarrhea per child by total
> number of days observed. I am new to Stata, but I am a good learner (I
> have many of the Stata Press books to help). One way I came across in
> the books was to use explicit subscripting; this would allow me to
> count the total number of days followed per child; but I am not sure
> how to create the algorithm for creating v2: perhaps foreach,
> forvalues, or even while, local macro??? I find the commands a bit
> intimidating for a newcomer, but am willing to spend time learning them.
> 
> Can anyone help?
> Best wishes
> 
> Shuaib
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 
