Statalist The Stata Listserver



Re: st: RE: Compute a summary variable based on a predefined algorithm


From   "Shuaib Kauchali" <sk041123@aol.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Compute a summary variable based on a predefined algorithm
Date   Sun, 18 Mar 2007 19:41:29 +0200

Thank you Nick for taking time to respond. I will try the suggested solutions and get back if I run into problems.

Regards

Shuaib

On Sun, 18 Mar 2007 19:25:03 +0200, Nick Cox <n.j.cox@durham.ac.uk> wrote:


This problem is easier than you think in that no use of looping
(-foreach- etc.) is needed. It is difficult in that there are
different possible reactions to missings on -v1-. This
post indicates one kind of solution.

You have panel data. You could -tsset- it without loss:

tsset childid day

That means that you could then use -tsspell- from SSC.
Alternatively, you can work from first principles.
I show the latter, but you might want to look at -tsspell- too.
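For comparison, here is a minimal sketch of the -tsspell- route on child 3 from the question. It assumes -tsspell- has been installed from SSC and that its default output variables (_spell, _seq, _end) are acceptable; the choice of cond(v1 == 1) as the spell criterion is an assumption about what counts as an episode, not something stated in the thread.

```stata
* run once if tsspell is not yet installed (needs internet access):
* ssc install tsspell

clear
input childid day v1
3 1 1
3 2 1
3 3 0
3 4 0
3 5 0
3 6 1
3 7 1
end

tsset childid day

* a spell is a run of consecutive days on which the condition is true
tsspell, cond(v1 == 1)

* _spell numbers the spells within each child (0 outside a spell),
* _seq counts days within a spell, _end flags the last day of a spell
list childid day v1 _spell _seq _end, sepby(childid)
```

Note that this numbers every run of 1s separately; merging runs that are fewer than three days apart would still need a further step.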

On one definition, each episode of diarrhea (in English,
diarrhoea) starts when v1 is 1 and the preceding value is not 1:

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1
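As a small check of how this indicator treats missing values, here is a sketch using child 4 from the question, typed in by hand (picking that child is my choice). A missing value on the previous day counts as "not 1", so day 3 is flagged as a new start under this definition.

```stata
clear
input childid day v1
4 1 1
4 2 .
4 3 1
4 4 0
4 5 .
4 6 0
4 7 0
end

* an episode starts when v1 is 1 and the preceding value is not 1;
* on a child's first row v1[_n-1] is missing, which also counts as "not 1"
bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1

list childid day v1 first, sepby(childid)
```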

-first- is an indicator variable. You can use it to define
episodes:

by childid : gen episodes = sum(first)

_or_

by childid : gen episodes = cond(v1 == 0, 0, sum(first))
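To see how the two versions differ, here is a sketch on child 3 from the question. The second variable is named episodes0 only so that both can sit side by side; that name is not from the post. Days with missing v1 would keep the running count in both versions, since only v1 == 0 is mapped to 0.

```stata
clear
input childid day v1
3 1 1
3 2 1
3 3 0
3 4 0
3 5 0
3 6 1
3 7 1
end

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1

* running count: days after an episode keep that episode's number
by childid: gen episodes = sum(first)

* variant: days with v1 == 0 are labelled 0, i.e. not in any episode
by childid: gen episodes0 = cond(v1 == 0, 0, sum(first))

list childid day v1 first episodes episodes0, sepby(childid)
```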

You can record the start dates of each episode:

by childid : gen start = day if first
by childid : replace start = start[_n-1] if !first

The time since the previous start is then

by childid : gen time_since = start - start[_n-1] if first
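A sketch of these steps on children 3 and 5 from the question may help. One thing it makes visible: time_since runs from the start of one episode to the start of the next, so it includes the length of the earlier episode rather than only the run of 0s between them.

```stata
clear
input childid day v1
3 1 1
3 2 1
3 3 0
3 4 0
3 5 0
3 6 1
3 7 1
5 1 0
5 2 1
5 3 0
5 4 1
end

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1

* start day of the current episode, carried forward to later rows;
* rows before a child's first episode stay missing
by childid: gen start = day if first
by childid: replace start = start[_n-1] if !first

* days from the previous episode's start; missing for a first episode
by childid: gen time_since = start - start[_n-1] if first

list childid day v1 first start time_since, sepby(childid)
```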

and you are then interested in counting how many episodes
are not within three days of the previous:

by childid : egen n_episodes = total(first * (time_since >= 3))

The first episode is always included on this definition.
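Putting the steps together on the example data from the question gives a way to compare the result with the desired v2. This is only a sketch: the last two commands, which set the count to missing for children with no recorded v1 at all, are an addition to match note 2 of the question and are not part of the solution above; the name n_nonmiss is made up for the purpose.

```stata
clear
input childid day v1 v2
1 1 . .
1 2 . .
1 3 . .
1 4 . .
2 1 0 1
2 2 1 1
2 3 1 1
2 4 0 1
3 1 1 2
3 2 1 2
3 3 0 2
3 4 0 2
3 5 0 2
3 6 1 2
3 7 1 2
4 1 1 1
4 2 . 1
4 3 1 1
4 4 0 1
4 5 . 1
4 6 0 1
4 7 0 1
5 1 0 1
5 2 1 1
5 3 0 1
5 4 1 1
6 1 0 0
6 2 0 0
6 3 0 0
6 4 0 0
6 5 0 0
6 6 0 0
6 7 0 0
end

bysort childid (day): gen first = v1 == 1 & v1[_n-1] != 1
by childid: gen start = day if first
by childid: replace start = start[_n-1] if !first
by childid: gen time_since = start - start[_n-1] if first

* a first episode has missing time_since, and Stata treats numeric
* missing as larger than any number, so (time_since >= 3) is 1 for it
by childid: egen n_episodes = total(first * (time_since >= 3))

* addition (assumption): children with no recorded v1 get missing, not 0
by childid: egen n_nonmiss = count(v1)
replace n_episodes = . if n_nonmiss == 0

* check against the v2 column typed in from the question
assert n_episodes == v2
```

On these six children the counts agree with v2, but that says nothing about other patterns of missing values, which may need a different rule.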

Nick
n.j.cox@durham.ac.uk

Shuaib Kauchali wrote:

I have a data set of birth cohort data with longitudinal follow-up of these children until they were 9 months old (270 days), unless they were lost to follow-up or died before then.

The data structure looks like this:
childid (repeated group variable, daily visit to the clinic)
day (day of visit)
v1 (diarrhea on that day of visit)
v2 <-- this is the variable I would like to get (defined as diarrhea episodes: a string of 1's separated by at least 3 consecutive 0's is an episode)


childid day v1  v2
1   1   .   .
1   2   .   .
1   3   .   .
1   4   .   .
2   1   0   1
2   2   1   1
2   3   1   1
2   4   0   1
3   1   1   2
3   2   1   2
3   3   0   2
3   4   0   2
3   5   0   2
3   6   1   2
3   7   1   2
4   1   1   1
4   2   .   1
4   3   1   1
4   4   0   1
4   5   .   1
4   6   0   1
4   7   0   1
5   1   0   1
5   2   1   1
5   3   0   1
5   4   1   1
6   1   0   0
6   2   0   0
6   3   0   0
6   4   0   0
6   5   0   0
6   6   0   0
6   7   0   0


Note:
1. childid=4 is a bit tricky because of missing values; we assume there is one episode, as there were not more than 3 days separating the 2 events.
2. childid=1 has not had any visits recorded, so he gets missing values for v2.
3. Not everyone is followed up for the same period: loss to follow-up, death, or completion of the study (in my data set this should happen when the child reaches 270 days from birth. This is a birth cohort of 2500 children).

My problem is that I am unable to manipulate the data in Stata to get the summary v2, the number of episodes of diarrhea per child, by total number of days observed. I am new to Stata, but I am a good learner (I have many of the Stata Press books to help). One way I came across in the books was to use explicit subscripting; this would allow me to count the total number of days followed per child, but I am not sure how to create the algorithm for creating v2: perhaps foreach, forvalues, while, or a local macro? I find the commands a bit intimidating for a newcomer, but am willing to spend time learning them.
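For the count of days followed per child mentioned here, a minimal sketch with by-groups and explicit subscripting could look like the following. The variable names n_days, last_day and n_observed are made up for illustration, and whether "days observed" should mean rows on file or days with a non-missing v1 is left open in the question, so both are shown.

```stata
clear
input childid day v1
1 1 .
1 2 .
2 1 0
2 2 1
2 3 1
2 4 0
end

* _N is the number of observations in the current by-group,
* so this is the number of rows (visit days) per child
bysort childid (day): gen n_days = _N

* day[_N] refers to the last row of the group: the last day on record
by childid: gen last_day = day[_N]

* number of days with a non-missing diarrhea record
by childid: egen n_observed = count(v1)

list, sepby(childid)
```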

Can anyone help?
Best wishes

Shuaib


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/




© Copyright 1996–2014 StataCorp LP