

From: Nick Cox <n.j.cox@durham.ac.uk>
To: "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Matching procedure based on shortest distance given latitudes and longitudes
Date: Thu, 9 Feb 2012 20:28:40 +0000

As the purpose of each -summarize- is to find the minimum, and no more, each could be run with -, meanonly-. As the -summarize-s are repeated inside the loop, the speed-up may be discernible.

Nick
n.j.cox@durham.ac.uk

Robert Picard

That works. A better way is to break out of the loop:

forvalues i = 1/`nobs2006' {
    qui sum d
    scalar mind = r(min)
    if mi(mind) continue, break
    qui sum id1 if d == mind
    local bestid1 = r(min)
    qui sum id2 if d == mind
    local bestid2 = r(min)
    qui replace matchid = `bestid2' if id1 == `bestid1'
    qui replace matchd = mind if id1 == `bestid1'
    qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
    dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
}

2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
> What about an additional if condition?
>
> i.e.
>
> forvalues i = 1/`nobs2006' {
>     qui sum d
>     if r(N)!=0 {
>         scalar mind = r(min)
>         qui sum id1 if d == mind
>         local bestid1 = r(min)
>         qui sum id2 if d == mind
>         local bestid2 = r(min)
>         qui replace matchid = `bestid2' if id1 == `bestid1'
>         qui replace matchd = mind if id1 == `bestid1'
>         qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>         dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
>     }
> }
>
> On 9 February 2012 20:37, Rüdiger Vollmeier
> <ruediger.vollmeier@googlemail.com> wrote:
>> Thanks to Robert for this smart and elegant way of dealing with this problem.
>>
>> However, if there are fewer observations in 2010 than in 2006,
>> matchid=1 is generated for all 2006 id=1 observations - even though
>> there is no shortest distance associated with this id.
>>
>> What would be an equally elegant way of solving this problem?
>>
>> Ruediger.
>>
>> 2012/2/9 Robert Picard <picard@netbox.com>:
>>> As I mentioned to you a few days ago, you do not need a special
>>> program to find the nearest neighbors. You can simply use -cross- to
>>> form all pairwise combinations of 2006 and 2010 observations, compute
>>> all the distances, and then sort. I've added some code that does, I
>>> think, the matching you describe.
>>>
>>> Robert
>>>
>>> *----------- begin example -------------
>>> version 12
>>>
>>> set seed 1234
>>>
>>> * save 2010 observations separately
>>> clear
>>> set obs 10
>>> gen id2 = _n
>>> gen lat2 = 40 + runiform() * 5
>>> gen lon2 = 19 + runiform() * 5
>>> tempfile y2010
>>> save "`y2010'"
>>>
>>> * create 7 obs for 2006
>>> clear
>>> local nobs2006 7
>>> set obs `nobs2006'
>>> gen id1 = _n
>>> gen lat1 = 40 + runiform() * 5
>>> gen lon1 = 19 + runiform() * 5
>>>
>>> * form all pairwise combinations and compute distance
>>> cross using "`y2010'"
>>> * user-written program, to install: ssc install geodist
>>> geodist lat1 lon1 lat2 lon2, gen(d)
>>>
>>> gen d0 = d
>>> gen matchid = .
>>> gen matchd = .
>>>
>>> forvalues i = 1/`nobs2006' {
>>>     qui sum d
>>>     scalar mind = r(min)
>>>     qui sum id1 if d == mind
>>>     local bestid1 = r(min)
>>>     qui sum id2 if d == mind
>>>     local bestid2 = r(min)
>>>     qui replace matchid = `bestid2' if id1 == `bestid1'
>>>     qui replace matchd = mind if id1 == `bestid1'
>>>     qui replace d = . if id1 == `bestid1' | id2 == `bestid2'
>>>     dis "id1=" `bestid1' " matched " "id2=" `bestid2' " at d = " mind
>>> }
>>>
>>> sort id1 d0 id2
>>>
>>> *------------ end example --------------
>>>
>>> 2012/2/9 Rüdiger Vollmeier <ruediger.vollmeier@googlemail.com>:
>>>> Hello guys,
>>>>
>>>> I want to match each observation in a given year with
>>>> one observation in another year based on the shortest geographical
>>>> distance between them, given the latitudes and longitudes of each
>>>> observation.
>>>>
>>>> I.e. the simplified structure of the dataset looks as follows:
>>>>
>>>> id   year   longitude   latitude
>>>> 1    2006   19.923      40.794
>>>> 2    2006   19.949      40.711
>>>> 1    2010   19.940      40.721
>>>> 2    2010   22.001      50.122
>>>>
>>>> Hence, I would like to match each observation in 2006 with the one
>>>> observation in 2010 that is closest AND that had not been matched to
>>>> any observation in 2006 before.
>>>>
>>>> The previously discussed -nearstat- command (thanks to Wilner!) cannot
>>>> be applied directly to this problem as it could match the same
>>>> observation in 2010 with multiple observations in 2006 (i.e. in this
>>>> example, the year 2010 observation with id 1 is closest to both
>>>> observations in 2006 - and hence would be matched).
>>>>
>>>> Does anybody have an idea for a nice solution or is there even a
>>>> command out there that would match based on distance given the
>>>> latitudes and longitudes?
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/statalist/faq
>>>> * http://www.ats.ucla.edu/stat/stata/
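For illustration, the -meanonly- suggestion can be combined with the early exit from the loop. What follows is a sketch only, not code posted in the thread: it reuses the toy data from Robert's example, it assumes the user-written -geodist- is installed (ssc install geodist), and it uses r(N) == 0 (borrowed from Rüdiger's variant) as the break condition instead of testing the scalar for missing.

```stata
version 12
set seed 1234

* 2010 observations, saved separately (toy data as in Robert's example)
clear
set obs 10
gen id2 = _n
gen lat2 = 40 + runiform() * 5
gen lon2 = 19 + runiform() * 5
tempfile y2010
save "`y2010'"

* 2006 observations
clear
local nobs2006 7
set obs `nobs2006'
gen id1 = _n
gen lat1 = 40 + runiform() * 5
gen lon1 = 19 + runiform() * 5

* all pairwise combinations; -geodist- is user-written (ssc install geodist)
cross using "`y2010'"
geodist lat1 lon1 lat2 lon2, gen(d)

gen d0 = d
gen matchid = .
gen matchd = .

forvalues i = 1/`nobs2006' {
    * -meanonly- skips the variance calculation and the output,
    * but still leaves r(min) and r(N) behind
    summarize d, meanonly
    * stop as soon as no unmatched pair is left
    if r(N) == 0 continue, break
    scalar mind = r(min)
    summarize id1 if d == mind, meanonly
    local bestid1 = r(min)
    summarize id2 if d == mind, meanonly
    local bestid2 = r(min)
    quietly replace matchid = `bestid2' if id1 == `bestid1'
    quietly replace matchd = mind if id1 == `bestid1'
    * remove the matched pair from further consideration
    quietly replace d = . if id1 == `bestid1' | id2 == `bestid2'
    display "id1=" `bestid1' " matched id2=" `bestid2' " at d = " mind
}

sort id1 d0 id2
```

Because -summarize, meanonly- displays nothing, the -quietly- prefix is not needed on those lines. With fewer 2010 than 2006 observations, the loop ends once every 2010 observation is taken, and the unmatched 2006 observations keep missing values in matchid and matchd.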

