# RE: st: calculating many distances and storing them in many new variables

From: Nick Cox
To: "'statalist@hsphsun2.harvard.edu'"
Subject: RE: st: calculating many distances and storing them in many new variables
Date: Fri, 21 Oct 2011 13:59:33 +0100

```That's my guess too in terms of "can", but not at all my guess in terms of "should".

You're implying that you want _hundreds_ more variables, but although that's not insuperable in terms of storage, it still raises the question of how you are going to work with those extra variables.

Here's an alternative approach:

. input patient_id str1 hosp_ch      lat_pat lng_pat lat_hosp        lng_hosp

patient~d    hosp_ch    lat_pat    lng_pat   lat_hosp   lng_hosp
1.  1               a               2               4               5      6
2.  2               a               1               3               5      6
3.  8               c               3               5               10     12
4.  9               a               5               2               5      6
5.  12              b               8               6               8      9
6.  end

. fillin patient_id hosp_ch

. bysort hosp_ch (lat_hosp) : replace lat_hosp = lat_hosp[1]

. by hosp_ch  : replace lng_hosp = lng_hosp[1]

. bysort patient_id (lat_pat) : replace lat_pat = lat_pat[1]

. by patient_id : replace lng_pat = lng_pat[1]

. sort hosp_ch patient_id

. l

+------------------------------------------------------------------------+
| patien~d   hosp_ch   lat_pat   lng_pat   lat_hosp   lng_hosp   _fillin |
|------------------------------------------------------------------------|
1. |        1         a         2         4          5          6         0 |
2. |        2         a         1         3          5          6         0 |
3. |        8         a         3         5          5          6         1 |
4. |        9         a         5         2          5          6         0 |
5. |       12         a         8         6          5          6         1 |
|------------------------------------------------------------------------|
6. |        1         b         2         4          8          9         1 |
7. |        2         b         1         3          8          9         1 |
8. |        8         b         3         5          8          9         1 |
9. |        9         b         5         2          8          9         1 |
10. |       12         b         8         6          8          9         0 |
|------------------------------------------------------------------------|
11. |        1         c         2         4         10         12         1 |
12. |        2         c         1         3         10         12         1 |
13. |        8         c         3         5         10         12         0 |
14. |        9         c         5         2         10         12         1 |
15. |       12         c         8         6         10         12         1 |
+------------------------------------------------------------------------+

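Now every observation is one patient-hospital pair, so any comparison between a patient and a possible hospital uses the same four coordinate variables (lat_pat, lng_pat, lat_hosp, lng_hosp) within a single observation. _fillin == 0 marks the hospital each patient actually went to.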

Nick
n.j.cox@durham.ac.uk

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Vitorino, Maria Ana
Sent: 21 October 2011 11:16
To: <statalist@hsphsun2.harvard.edu>
Subject: Re: st: calculating many distances and storing them in many new variables

Dear Yuval,
Thanks. But the goal is different. I know to which hospital each patient went. What I'm trying to do is to calculate the distance between each patient and each hospital alternative (i.e. all the hospitals that the patient could have decided to go to).
I'm guessing this can be done with some sort of loop...

Ana

>
> On Fri, Oct 21, 2011 at 5:13 AM, Maria Ana Vitorino
> <vitorino@wharton.upenn.edu> wrote:
>> Dear StataList users,
>>
>> Suppose we have the following toy data which has a list with patients, the
>> hospital to which they went (hosp_ch) and the coordinates for both.
>>
>> patient_id  hosp_ch  lat_pat  lng_pat  lat_hosp  lng_hosp
>> 1           a        2        4        5         6
>> 2           a        1        3        5         6
>> 8           c        3        5        10        12
>> 9           a        5        2        5         6
>> 12          b        8        6        8         9
>>
>>
>> What I would like to do is:
>> **To create new variables with the distances from every patient to every
>> possible hospital in the data. So, I would like 3 new columns which will
>> contain the distance from each patient to each hospital in the data.
>> **Also, I would like to have those new columns labeled dist_a, dist_b and
>> dist_c.
>>
>> Is there an efficient way to achieve this?
>> In the real data, I have many more patients and hospitals (hundreds in fact)
>> so I would like to generate these variables in as automated a way as
>> possible.
>>
>> Any help is appreciated.

```
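
For anyone who wants to carry the long layout in Nick's reply through to actual distances, here is a minimal sketch using the toy data from the thread. It is not code from the original posts: the names `dist` and `dist_` are assumed, and plain Euclidean distance is used only because the toy coordinates are small integers. With real latitudes and longitudes a great-circle calculation would be the better choice (the user-written `geodist` command on SSC is one option). The final `reshape wide` is there only if the `dist_a`, `dist_b`, `dist_c` columns from the original question are really needed.

```stata
* Sketch only: variable names follow the thread; dist and dist_ are assumed names
clear
input patient_id str1 hosp_ch lat_pat lng_pat lat_hosp lng_hosp
1  a 2 4 5  6
2  a 1 3 5  6
8  c 3 5 10 12
9  a 5 2 5  6
12 b 8 6 8  9
end

* One observation per patient-hospital pair;
* _fillin == 0 marks the hospital the patient actually went to
fillin patient_id hosp_ch

* Copy hospital coordinates within hospital and patient coordinates within patient.
* Missing values sort last, so the first observation in each group is an observed one.
bysort hosp_ch (lat_hosp) : replace lat_hosp = lat_hosp[1]
by hosp_ch : replace lng_hosp = lng_hosp[1]
bysort patient_id (lat_pat) : replace lat_pat = lat_pat[1]
by patient_id : replace lng_pat = lng_pat[1]

* Distance for every pair in one statement (Euclidean: an assumption for the toy data)
generate double dist = sqrt((lat_pat - lat_hosp)^2 + (lng_pat - lng_hosp)^2)

sort patient_id hosp_ch
list patient_id hosp_ch dist _fillin, sepby(patient_id)

* Optional: one column per hospital, named dist_a, dist_b, dist_c
rename dist dist_
keep patient_id lat_pat lng_pat hosp_ch dist_
reshape wide dist_, i(patient_id) j(hosp_ch) string
list
```

In the long layout the nearest alternative for each patient is a one-liner (for example, `bysort patient_id (dist)` puts it first), whereas in the wide layout the same question means looping over hundreds of `dist_*` variables, which is the practical concern raised in the reply.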