

From: Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Matching procedure based on shortest distance given latitudes and longitudes
Date: Thu, 9 Feb 2012 21:56:25 +0100

Thanks for the replies.

2012/2/9 Nick Cox <n.j.cox@durham.ac.uk>:
> As the purpose of each -summarize- is to find the minimum, and no more,
> each could be done -, meanonly-. As the -summarize-s are repeated, the
> speed-up may be discernible.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Robert Picard
>
> That works. A better way is to break out of the loop:
>
> forvalues i = 1/`nobs2006' {
>     qui sum d
>     scalar mind = r(min)
>     if mi(mind) continue, break
>     qui sum id1 if d == mind
>     local bestid1 = r(min)
>     qui sum id2 if d == mind
>     local bestid2 = r(min)
>     qui replace matchid = `bestid2' if id1 == `bestid1'
>     qui replace matchd = mind if id1 == `bestid1'
>     qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>     dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
> }
>
> 2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
>> What about an additional if condition?
>>
>> i.e.
>>
>> forvalues i = 1/`nobs2006' {
>>     qui sum d
>>     if r(N)!=0 {
>>         scalar mind = r(min)
>>         qui sum id1 if d == mind
>>         local bestid1 = r(min)
>>         qui sum id2 if d == mind
>>         local bestid2 = r(min)
>>         qui replace matchid = `bestid2' if id1 == `bestid1'
>>         qui replace matchd = mind if id1 == `bestid1'
>>         qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>>         dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
>>     }
>> }
>>
>> On 9 February 2012 at 20:37, Rüdiger Vollmeier
>> <ruediger.vollmeier@googlemail.com> wrote:
>>> Thanks to Robert for this smart and elegant way of dealing with this problem.
>>>
>>> However, if there are fewer observations in 2010 than in 2006,
>>> matchid=1 is generated for all 2006 id=1 observations, even though
>>> there is no shortest distance associated with this id.
>>>
>>> What would be an equally elegant way of solving this problem?
>>>
>>> Ruediger.
>>>
>>> 2012/2/9 Robert Picard <picard@netbox.com>:
>>>> As I mentioned to you a few days ago, you do not need a special
>>>> program to find the nearest neighbors. You can simply use -cross- to
>>>> form all pairwise combinations of 2006 and 2010 observations, compute
>>>> all the distances, and then sort. I've added some code that does, I
>>>> think, the matching you describe.
>>>>
>>>> Robert
>>>>
>>>> *----------- begin example -------------
>>>> version 12
>>>>
>>>> set seed 1234
>>>>
>>>> * save 2010 observations separately
>>>> clear
>>>> set obs 10
>>>> gen id2 = _n
>>>> gen lat2 = 40 + runiform() * 5
>>>> gen lon2 = 19 + runiform() * 5
>>>> tempfile y2010
>>>> save "`y2010'"
>>>>
>>>> * create 7 obs for 2006
>>>> clear
>>>> local nobs2006 7
>>>> set obs `nobs2006'
>>>> gen id1 = _n
>>>> gen lat1 = 40 + runiform() * 5
>>>> gen lon1 = 19 + runiform() * 5
>>>>
>>>> * form all pairwise combinations and compute distance
>>>> cross using "`y2010'"
>>>> * user-written program, to install: ssc install geodist
>>>> geodist lat1 lon1 lat2 lon2, gen(d)
>>>>
>>>> gen d0 = d
>>>> gen matchid = .
>>>> gen matchd = .
>>>>
>>>> forvalues i = 1/`nobs2006' {
>>>>     qui sum d
>>>>     scalar mind = r(min)
>>>>     qui sum id1 if d == mind
>>>>     local bestid1 = r(min)
>>>>     qui sum id2 if d == mind
>>>>     local bestid2 = r(min)
>>>>     qui replace matchid = `bestid2' if id1 == `bestid1'
>>>>     qui replace matchd = mind if id1 == `bestid1'
>>>>     qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>>>>     dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
>>>> }
>>>>
>>>> sort id1 d0 id2
>>>>
>>>> *------------ end example --------------
>>>>
>>>> 2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
>>>>> Hello guys,
>>>>>
>>>>> I want to match each observation in a given year with
>>>>> one observation in another year based on the shortest geographical
>>>>> distance between them given the latitudes and longitudes of each
>>>>> observation.
>>>>>
>>>>> That is, the simplified structure of the dataset looks as follows:
>>>>>
>>>>> id   year   longitude   latitude
>>>>> 1    2006   19.923      40.794
>>>>> 2    2006   19.949      40.711
>>>>> 1    2010   19.940      40.721
>>>>> 2    2010   22.001      50.122
>>>>>
>>>>> Hence, I would like to match each observation in 2006 with the one
>>>>> observation in 2010 that is closest AND that had not been matched to
>>>>> any observation in 2006 before.
>>>>>
>>>>> The previously discussed -nearstat- command (thanks to Wilner!) cannot
>>>>> be applied directly to this problem as it could match the same
>>>>> observation in 2010 with multiple observations in 2006 (i.e. in this
>>>>> example, the year 2010 observation with id 1 is closest to both
>>>>> observations in 2006 and hence would be matched to both).
>>>>>
>>>>> Does anybody have an idea for a nice solution or is there even a
>>>>> command out there that would match based on distance given the
>>>>> latitudes and longitudes?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
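For readers following the thread, here is a minimal sketch that combines Nick's -meanonly- suggestion with Robert's break-out-of-the-loop version. It is an illustration only, not code posted by anyone in the thread: the six distances entered with -input- are made-up values (so -geodist- is not needed to run it), and the case shown is the one raised above, with fewer 2010 observations (id2) than 2006 observations (id1).

```stata
* Hypothetical toy data: 3 obs in 2006 (id1) crossed with 2 obs in 2010 (id2);
* the distances d are invented for illustration
clear
input id1 id2 d
1 1 5
1 2 9
2 1 2
2 2 7
3 1 4
3 2 6
end

gen matchid = .
gen matchd = .

* number of 2006 observations (id1 runs from 1 to its maximum here)
qui sum id1, meanonly
local nobs2006 = r(max)

forvalues i = 1/`nobs2006' {
    * smallest remaining distance; r(min) is missing once every d is missing
    qui sum d, meanonly
    scalar mind = r(min)
    if mi(mind) continue, break
    * identify the pair at that distance (lowest id wins on ties)
    qui sum id1 if d == mind, meanonly
    local bestid1 = r(min)
    qui sum id2 if d == mind, meanonly
    local bestid2 = r(min)
    * record the match, then retire both observations from further matching
    qui replace matchid = `bestid2' if id1 == `bestid1'
    qui replace matchd = mind if id1 == `bestid1'
    qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
}
```

With these toy distances the loop stops after two matches, and the 2006 observation that cannot be paired is left with matchid and matchd missing rather than being assigned a spurious match.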

**References**:
- **st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Robert Picard <picard@netbox.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>
- **Re: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Robert Picard <picard@netbox.com>
- **RE: st: Matching procedure based on shortest distance given latitudes and longitudes** *From:* Nick Cox <n.j.cox@durham.ac.uk>
