


st: Creating a contact tracing file (levelsof & nested foreach loops)

From   "Puddicombe, David"
Subject   st: Creating a contact tracing file (levelsof & nested foreach loops)
Date   Fri, 28 Sep 2012 08:40:37 -0700


I would like to create a file that indicates which cases of an infectious disease outbreak may have been the source of infection for later cases.

The variables in my dataset are:
id_n (non-identifying case id, values 1 to 83)
date_onset (date of disease onset, %td format)
city (city where each case resides, numeric variable)

Cases occurred in seven cities:

       City |      Freq.     Percent        Cum.
------------+-----------------------------------
     city_1 |         46       55.42       55.42
     city_2 |          2        2.41       57.83
     city_3 |         12       14.46       72.29
     city_4 |          3        3.61       75.90
     city_5 |         18       21.69       97.59
     city_7 |          1        1.20       98.80
     city_8 |          1        1.20      100.00
------------+-----------------------------------
      Total |         83      100.00

Within each city, I would like to create indicator variables (0 = No, 1 = Yes) that flag which subsequent cases may have been infected by each case, and to generate variables holding the id number of each potential subsequent case.
A case (case1) might be the source of infection for a subsequent case (case2) if both of the following hold:
1. the two cases are in the same city; and
2. 7 <= (date_onset[case2] - date_onset[case1]) <= 18, i.e. case2's onset falls 7 to 18 days after case1's.
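
To make the rule concrete, here is a minimal sketch of it as a single check. It is written in Python rather than Stata, purely to illustrate the logic; the function name may_be_source and its parameters are hypothetical and not part of my dataset.

```python
from datetime import date

def may_be_source(city1, onset1, city2, onset2, min_days=7, max_days=18):
    """Return True if case 1 may have infected case 2 under the two rules above."""
    # Days from case 1's onset to case 2's onset (positive when case 2 is later).
    gap = (onset2 - onset1).days
    return city1 == city2 and min_days <= gap <= max_days

# Cases 1 and 4 in the sample data: same city, onsets 9 days apart.
print(may_be_source("city_1", date(2010, 3, 9), "city_1", date(2010, 3, 18)))  # True
# Cases 1 and 2: same city, but onsets only 1 day apart.
print(may_be_source("city_1", date(2010, 3, 9), "city_1", date(2010, 3, 10)))  # False
```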

For example, case 1 in city_1 could be the source of infection for any of the other 45 cases in city_1.  I would like to create two variables for each of these 45 potential contacts (potential_contact_n_yn and potential_contact_n_id). For case 2 in city_1, there are 44 potential subsequent cases, and so on.

I've tried several levelsof and foreach loops but cannot make my code work.  My aim is to create a wide dataset describing the relationships between cases within each city, and then reshape it from wide to long, with one row for each potential source-contact pair, for use in a network analysis program (Cytoscape).
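
To show the long layout I am ultimately after, here is a minimal sketch that pairs cases within each city and keeps one row per potential source and contact. It is Python, not Stata, it skips the intermediate wide file, and it uses only six rows from the sample data; the names build_edges, source_id and contact_id are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Six rows copied from the sample data: (id_n, date_onset, city).
cases = [
    (1, date(2010, 3, 9), "city_1"),
    (2, date(2010, 3, 10), "city_1"),
    (4, date(2010, 3, 18), "city_1"),
    (17, date(2010, 3, 22), "city_2"),
    (19, date(2010, 3, 25), "city_3"),
    (29, date(2010, 3, 29), "city_1"),
]

def build_edges(cases, min_days=7, max_days=18):
    """Return sorted (source_id, contact_id, gap_in_days) rows, one per potential link."""
    # Group cases by city so that only same-city pairs are ever compared.
    by_city = defaultdict(list)
    for id_n, onset, city in cases:
        by_city[city].append((id_n, onset))

    edges = []
    for members in by_city.values():
        for source_id, source_onset in members:
            for contact_id, contact_onset in members:
                gap = (contact_onset - source_onset).days
                # A case paired with itself has gap 0 and is excluded by the lower bound.
                if min_days <= gap <= max_days:
                    edges.append((source_id, contact_id, gap))
    return sorted(edges)

for source_id, contact_id, gap in build_edges(cases):
    print(source_id, contact_id, gap)
```

For these six rows the sketch prints three links: 1 to 4 (9 days), 2 to 4 (8 days) and 4 to 29 (11 days). The pairs 1 to 29 and 2 to 29 are dropped because their gaps (20 and 19 days) exceed 18.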

Any suggestions/help would be greatly appreciated.  Some sample data are below.

id_n date_onset city
1    09mar2010  city_1
2    10mar2010  city_1
3    11mar2010  city_1
4    18mar2010  city_1
5    18mar2010  city_1
6    18mar2010  city_1
7    18mar2010  city_1
8    19mar2010  city_1
9    19mar2010  city_1
10   20mar2010  city_1
11   21mar2010  city_1
12   21mar2010  city_1
13   22mar2010  city_1
14   22mar2010  city_1
15   22mar2010  city_1
16   22mar2010  city_1
17   22mar2010  city_2
18   24mar2010  city_1
19   25mar2010  city_3
20   25mar2010  city_7
21   25mar2010  city_1
22   25mar2010  city_1
23   26mar2010  city_1
24   26mar2010  city_1
25   26mar2010  city_3
26   27mar2010  city_3
27   28mar2010  city_5
28   28mar2010  city_3
29   29mar2010  city_1
30   29mar2010  city_1

Best wishes,

David Puddicombe
Communicable Disease Epidemiologist
Tel 604 707 2537
Fax 604 707 2515
Immunization Programs and Vaccine Preventable Diseases Service
BC Centre for Disease Control
655 West 12th Avenue, Vancouver, BC Canada V5Z 4R4
