Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Matching procedure based on shortest distance given latitudes and longitudes

Date: Thu, 9 Feb 2012 20:38:10 +0100

What about an additional -if- condition? I.e.:

forvalues i = 1/`nobs2006' {
    qui sum d
    if r(N) != 0 {
        scalar mind = r(min)
        qui sum id1 if d == mind
        local bestid1 = r(min)
        qui sum id2 if d == mind
        local bestid2 = r(min)
        qui replace matchid = `bestid2' if id1 == `bestid1'
        qui replace matchd = mind if id1 == `bestid1'
        qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
        dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
    }
}

On 9 February 2012 at 20:37, Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com> wrote:
> Thanks to Robert for this smart and elegant way of dealing with this problem.
>
> However, if there are fewer observations in 2010 than in 2006,
> matchid=1 is generated for all 2006 id=1 observations - even though
> there is no shortest distance associated with this id.
>
> What would be an equally elegant way of solving this problem?
>
> Ruediger.
>
> 2012/2/9 Robert Picard <picard@netbox.com>:
>> As I mentioned to you a few days ago, you do not need a special
>> program to find the nearest neighbors. You can simply use -cross- to
>> form all pairwise combinations of 2006 and 2010 observations, compute
>> all the distances, and then sort. I've added some code that does, I
>> think, the matching you describe.
>>
>> Robert
>>
>> *----------- begin example -------------
>> version 12
>>
>> set seed 1234
>>
>> * save 2010 observations separately
>> clear
>> set obs 10
>> gen id2 = _n
>> gen lat2 = 40 + runiform() * 5
>> gen lon2 = 19 + runiform() * 5
>> tempfile y2010
>> save "`y2010'"
>>
>> * create 7 obs for 2006
>> clear
>> local nobs2006 7
>> set obs `nobs2006'
>> gen id1 = _n
>> gen lat1 = 40 + runiform() * 5
>> gen lon1 = 19 + runiform() * 5
>>
>> * form all pairwise combinations and compute distance
>> cross using "`y2010'"
>> * user-written program, to install: ssc install geodist
>> geodist lat1 lon1 lat2 lon2, gen(d)
>>
>> gen d0 = d
>> gen matchid = .
>> gen matchd = .
>>
>> forvalues i = 1/`nobs2006' {
>>     qui sum d
>>     scalar mind = r(min)
>>     qui sum id1 if d == mind
>>     local bestid1 = r(min)
>>     qui sum id2 if d == mind
>>     local bestid2 = r(min)
>>     qui replace matchid = `bestid2' if id1 == `bestid1'
>>     qui replace matchd = mind if id1 == `bestid1'
>>     qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>>     dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
>> }
>>
>> sort id1 d0 id2
>>
>> *------------ end example --------------
>>
>> 2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
>>> Hello guys,
>>>
>>> I want to match each observation in a given year with
>>> one observation in another year based on the shortest geographical
>>> distance between them given the latitudes and longitudes of each
>>> observation.
>>>
>>> I.e. the simplified structure of the dataset looks as follows:
>>>
>>> id   year   longitude   latitude
>>> 1    2006   19.923      40.794
>>> 2    2006   19.949      40.711
>>> 1    2010   19.940      40.721
>>> 2    2010   22.001      50.122
>>>
>>> Hence, I would like to match each observation in 2006 with the one
>>> observation in 2010 that is closest AND that had not been matched to
>>> any observation in 2006 before.
>>>
>>> The previously discussed -nearstat- command (thanks to Wilner!) cannot
>>> be applied directly to this problem as it could match the same
>>> observation in 2010 with multiple observations in 2006 (i.e. in this
>>> example, the year 2010 observation with id 1 is closest to both
>>> observations in 2006 - and hence would be matched).
>>>
>>> Does anybody have an idea for a nice solution or is there even a
>>> command out there that would match based on distance given the
>>> latitudes and longitudes?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
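The case raised in this message (fewer 2010 observations than 2006 observations) can also be handled by looping only while unmatched pairs remain, rather than a fixed `nobs2006' times. The following is a minimal sketch under stated assumptions, not code posted in the thread: it reuses the variable names from Robert's example, assumes the user-written -geodist- is installed, and uses hypothetical sample sizes (3 observations in 2010, 5 in 2006) and a hypothetical local macro `left' as the loop counter.

```stata
* Sketch only: greedy one-to-one matching that stops when no pair is left.
* Assumes -geodist- is installed (ssc install geodist), as in Robert's example.
version 12
set seed 1234

* hypothetical data: only 3 observations in 2010 ...
clear
set obs 3
gen id2 = _n
gen lat2 = 40 + runiform() * 5
gen lon2 = 19 + runiform() * 5
tempfile y2010
save "`y2010'"

* ... but 5 observations in 2006
clear
set obs 5
gen id1 = _n
gen lat1 = 40 + runiform() * 5
gen lon1 = 19 + runiform() * 5

* all pairwise combinations and their distances
cross using "`y2010'"
geodist lat1 lon1 lat2 lon2, gen(d)

gen d0 = d
gen matchid = .
gen matchd = .

* loop while at least one candidate pair still has a nonmissing distance
qui count if !missing(d)
local left = r(N)
while `left' > 0 {
    qui sum d, meanonly
    scalar mind = r(min)
    qui sum id1 if d == mind, meanonly
    local bestid1 = r(min)
    * restrict to the chosen id1 so that ties cannot mix two different pairs
    qui sum id2 if d == mind & id1 == `bestid1', meanonly
    local bestid2 = r(min)
    qui replace matchid = `bestid2' if id1 == `bestid1'
    qui replace matchd = mind if id1 == `bestid1'
    * remove both matched units from further consideration
    qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
    qui count if !missing(d)
    local left = r(N)
}

sort id1 d0 id2
* one row per 2006 observation; unmatched ones keep matchid == .
list id1 matchid matchd if id2 == 1, noobs
```

With these sample sizes the loop stops once every 2010 observation has been used, so the 2006 observations left over keep missing values in matchid and matchd instead of being assigned a spurious match.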

**Follow-Ups**:

- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Robert Picard <picard@netbox.com>

**References**:

- **st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Robert Picard <picard@netbox.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
