Re: st: Calculating the shortest distances between observations (based on longitude and latitude)


From   Robert Picard <picard@netbox.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Calculating the shortest distances between observations (based on longitude and latitude)
Date   Thu, 2 Feb 2012 11:21:48 -0500

Take a look at -geonear- and -geodist-, both available from SSC. If
you have only two observation types, then the simplest approach is to
form all pairwise combinations of locations and then calculate the
distances.

*----------- begin example -------------
version 12

clear
input otype str10 country year lat lon
1 Albania 2010 42.07972 19.52361
1 Albania 2010 42.15028 19.66389
1 Albania 2010 42.01667 19.48333
2 Albania 2010 39.95 20.28333
2 Albania 2010 42.08417 20.42
end

* save type 1 and type 2 observations separately
tempfile main type2
save "`main'"
keep if otype == 2
rename * *2
gen id2 = _n
save "`type2'"
use "`main'"
keep if otype == 1
gen id1 = _n

* form all pairwise combinations and calculate distance
cross using "`type2'"
geodist lat lon lat2 lon2, gen(d)
* within each id1, pairs are now ordered from nearest to farthest
sort id1 d
*------------ end example --------------
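
To pick out the nearest, second nearest, and so on, one option is to
number the pairs within each type 1 observation once they are sorted
by distance. What follows is only a sketch under a few assumptions: it
repeats the example above so that it runs on its own, it assumes
-geodist- is installed from SSC, and the variable name rank and the
same-country, same-year filter are illustrative additions rather than
part of the example above.

*----------- begin example -------------
* assumes -geodist- is installed (ssc install geodist)
version 12

clear
input otype str10 country year lat lon
1 Albania 2010 42.07972 19.52361
1 Albania 2010 42.15028 19.66389
1 Albania 2010 42.01667 19.48333
2 Albania 2010 39.95 20.28333
2 Albania 2010 42.08417 20.42
end

* split by type and form all pairwise combinations, as above
tempfile main type2
save "`main'"
keep if otype == 2
rename * *2
gen id2 = _n
save "`type2'"
use "`main'"
keep if otype == 1
gen id1 = _n
cross using "`type2'"

* assumption: only pairs from the same country and year are wanted
keep if country == country2 & year == year2

* distance for each pair, then number the pairs within id1
* (id2 is added to the sort only to break ties reproducibly)
geodist lat lon lat2 lon2, gen(d)
sort id1 d id2
by id1: gen rank = _n

* rank == 1 is the nearest type 2 observation, rank == 2 the second nearest
list id1 id2 rank d if rank <= 2, sepby(id1)
*------------ end example --------------

Because every pair is kept as its own observation, the third or fourth
nearest neighbour needs no extra loop: change the condition on rank.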

2012/2/2 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
> Hello guys,
>
> I want to calculate the shortest distances between observations based
> on the coordinates (latitude, longitude). I have adapted a simple
> version from N. Cox's nearest neighbor search which was presented here
> some time ago. In contrast to that, I want to calculate not only the
> shortest but also the second shortest (third, and so on) distances.
>
> Here is a simplified structure of the dataset:
>
> observation_type  country  year  latitude  longitude
> 1                 Albania  2010  42.07972  19.52361
> 1                 Albania  2010  42.15028  19.66389
> 1                 Albania  2010  42.01667  19.48333
> 2                 Albania  2010  39.95     20.28333
> 2                 Albania  2010  42.08417  20.42
>
> I want to calculate the smallest distances for a given observation of
> observation_type=1 to an observation of type=2 for a given year in a
> given country. Here is the code (all variables are generated of the
> form gen bank_1_dist_1 =.)
>
> * Shortest distance
> local n = _N
> forval i = 1/`n' {
>     forval j = 1/`n' {
>         if (`i' != `j') & (observation_type[`i']==1) & ///
>            (observation_type[`j']==2) & ///
>            (country_number[`i']==country_number[`j']) & (year[`i']==year[`j']) {
>             local d = (latitude[`i'] - latitude[`j'])^2 + ///
>                 (longitude[`i'] - longitude[`j'])^2
>             replace bank_2010_1_`j'=`d' in `i'
>             if `d' < bank_1_dist_1[`i'] {
>                 replace bank_1_dist_1 = `d' in `i'
>                 replace bank_1_id_1 = `j' in `i'
>             }
>         }
>     }
> }
> * Second shortest distance
> local n = _N
> forval i = 1/`n' {
>     forval j = 1/`n' {
>         if (`i' != `j') & (observation_type[`i']==1) & ///
>            (observation_type[`j']==2) & ///
>            (country_number[`i']==country_number[`j']) & (year[`i']==year[`j']) {
>             local d2 = (latitude[`i'] - latitude[`j'])^2 + ///
>                 (longitude[`i'] - longitude[`j'])^2
>             if (`d2' > bank_1_dist_1[`i']) & (`d2' < bank_1_dist_2[`i']) {
>                 replace bank_1_dist_2 = `d2' in `i'
>                 replace bank_1_id_2 = `j' in `i'
>             }
>         }
>     }
> }
>
> Here is the problem: the shortest distance seems to be calculated
> correctly. However, the second shortest distance is not (sometimes it
> takes on the same value as the shortest distance, and only sometimes
> is it the actual second shortest distance). Do you know why? Do you
> have any suggestions for improvement?
>
> Thanks in advance.
> Ruediger
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


