
From: Austin Nichols <[email protected]>
To: [email protected]
Subject: Re: st: RE: exact command for distance ?
Date: Sat, 12 Sep 2009 07:48:07 -0400

Roy--
We do seem to be in some sort of twilight zone, a realm of asymmetric rules about evidence and civility, but I see no contradiction in what I have posted. I appreciate users posting code on SSC and elsewhere. In Laura's problem, an easy solution is available (via an unmatched merge and a loop over observations) without downloading any code, which is not to say that a solution using a downloadable program would not get there in fewer lines of code (or at least fewer lines of code visible in an email). However, an approach like mine (unmatched merge and loop over observations) works for any problem of matching one dataset to another, with whatever calculations are needed done inside the loop.

-vincenty- provides better accuracy, but at a cost: one has to download it, and it is slower than simpler calculations. The accuracy may in fact be very important for some problems where several neighbors are a similar distance from a point and it is crucial to find the actual nearest neighbor. This is not an issue for Laura, who only wants the minimum distance and can tolerate a fairly large error.

My main point about -distmatch- you do not seem to have answered: the help file makes a claim about its relative speed that seems unsupported by the evidence. I have not recommended that people not download it, but I maintain that the help file is inaccurate and should be corrected. I also recommend you add some guidance for folks looking for a solution to Laura's problem, involving a second dataset, as the examples in the help file do not seem to be transparent to users as they stand, at least on how to approach the two-dataset problem.

I maintain that the code below is a simple and elegant solution, using only built-in commands and one reasonably fast call to -merge- (the whole thing might be slightly faster in Mata, but at a cost of lost transparency).
The code works just as well if the second file is a polygon file, in which case I would label the variable mindist "Distance to nearest body of water" without mentioning that the distance is measured to the nearest vertex of all polygons; a suitably detailed polygon file will make the distance suitably accurate.

 use farms, clear
 local nf=_N
 g double mindist=.
 * unmatched (sequential) merge: puts the water bodies alongside the farms
 merge using waterbodies
 * approximate Earth radius in km
 local R=6367.44
 qui forv i=1/`nf' {
  * x = latitude, y = longitude (degrees); farm `i' vs. every water body
  local x1=farm_Y[`i']
  local y1=farm_X[`i']
  local x2 wat_Y
  local y2 wat_X
  * longitude difference in radians, wrapped into [-pi, pi]
  g double L=(`y2'-`y1')*_pi/180
  replace L=(`y2'-`y1'-360)*_pi/180 if L<. & L>_pi
  replace L=(`y2'-`y1'+360)*_pi/180 if L<-_pi
  * t1 opens acos( and the next line closes it
  local t1 acos(sin(`x2'*_pi/180)*sin(`x1'*_pi/180)
  g double d=`t1'+cos(`x2'*_pi/180)*cos(`x1'*_pi/180)*cos(L))*`R'
  su d, meanonly
  replace mindist=r(min) in `i'
  drop L d
 }
 drop _m waterbody_ID wat_X wat_Y

On Fri, Sep 11, 2009 at 5:47 PM, Roy Wada <[email protected]> wrote:
> Austin,
>
> Thanks for your feedback. You seem to be contradicting yourself
> on occasions but some people do that now and then.
>
> If vincenty is critical, then why are you now recommending codes
> not based on vincenty? You already know that vincenty makes no
> important differences for the distance less than 100 miles.
>
> Please do make the calculations for us and tell us how this
> will impact someone's research.
>
> I agree -distmatch- can be made to run faster (it should recycle
> previous rankings) but not for the reason you posted.
>
> You are forgetting to mention that your codes cannot perform ranking
> or complete matching. It only looks for the minimum distance.
>
> This has been pointed out to you before.
>
> I would post another comparison except for the fact that your codes
> do not work for other matchings.
>
> You seem to be creating a moving target with ad hoc fixes, and
> suggest other people do the same. If they can do this, why would
> they need you?
>
> Are we stuck in the twilight zone where people do not need help
> but in fact should be made to take one when offered.
>
> There is something funny about people who claim exclusive expertise.
>
> Let's agree it is a very bad idea to tell other people to not use
> someone else's program.
>
> Roy
>
> P.S. You can take your download programs to the data center just
> like other programs. Just put it in the current directory if you
> still do not know how to do this.
>
>
>> Roy--
>> I also have no problem downloading others' work, and my hard drive is
>> cluttered with the output of Jann, Baum, Schaffer, Jenkins, Cox, and
>> many others. I seem to use one of Ben Jann's programs every day. And
>> one of my posted solutions on this topic requires downloading
>> -vincenty- (from SSC), which gives much better distance estimates,
>> though at a substantial time cost.
>>
>> I am not even claiming that -distmatch- has no utility--no doubt many
>> will find it useful. But I'm afraid I don't see your point in this
>> post at all--you claim in the help file that -distmatch- "take several
>> minutes to complete" for 3000 obs, and other methods take "days if not
>> weeks" yet the method that I have outlined in several posts is
>> entirely general (i.e. it can be customized to produce any range of
>> statistics for any range of neighbors, which no program can claim to
>> do) and runs faster than -distmatch- in many cases, e.g. by a factor
>> of four or five here:

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
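The claim in the message that other calculations can be done inside the same loop can be illustrated with a small sketch. It is not code from the thread: the toy coordinates, the variable nwithin, and the 10 km cutoff are all made up for illustration, and -merge 1:1 _n- is assumed as the Stata 11+ spelling of the sequential merge. It should be run as a do-file, since it relies on local macros and a tempfile.

```stata
* Hypothetical toy data: three water bodies (latitude/longitude in degrees)
clear
input waterbody_ID wat_Y wat_X
1 38.95 -77.10
2 40.60 -74.05
3 39.50 -76.00
end
tempfile water
save "`water'"

* Hypothetical toy data: two farms
clear
input farm_ID farm_Y farm_X
1 38.90 -77.04
2 40.71 -74.01
end
local nf=_N
g double mindist=.
g long nwithin=.

* Sequential merge: water bodies are placed alongside the farms
merge 1:1 _n using "`water'"

local R=6367.44        // approximate Earth radius in km, as in the message
local cutoff=10        // hypothetical radius in km
qui forv i=1/`nf' {
 * longitude difference in radians; cos() is periodic, so no wrap is needed
 g double L=(wat_X-farm_X[`i'])*_pi/180
 * spherical law of cosines, distance in km from farm `i' to each water body
 g double d=acos(sin(wat_Y*_pi/180)*sin(farm_Y[`i']*_pi/180) ///
   +cos(wat_Y*_pi/180)*cos(farm_Y[`i']*_pi/180)*cos(L))*`R'
 su d, meanonly
 replace mindist=r(min) in `i'
 * second statistic from the same pass: water bodies within the cutoff
 count if d<=`cutoff'
 replace nwithin=r(N) in `i'
 drop L d
}
keep in 1/`nf'
drop _merge waterbody_ID wat_X wat_Y
list farm_ID mindist nwithin
```

The only change from the minimum-distance loop is the -count- after -summarize-; any other per-farm statistic computed from d (a mean, a rank, an indicator) could be stored the same way before d is dropped.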

