# Re: st: Nearest distance (spatial) and shp2dta question

From: Austin Nichols
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Nearest distance (spatial) and shp2dta question
Date: Thu, 2 Jul 2009 11:50:43 -0400

```
David Torres <torresd@umich.edu>:
Each of (1) and (2) is easy to do (if time-consuming) with a single
loop over observations.

Note that the main trick in my code was to do
a nonmatched -merge- to get both datasets in memory at once.  You will
have to be a bit more clear about what you want to get more specific
advice: "calculate distances between census tract and county centroids
to the nearest [school] with population of the tracts or counties used
as weights" sounds like you want the final dataset to have one obs for
each institution with some kind of average distance to potential
students, but (2) says you want one obs per centroid.  What are these
calculations to be used for?  Maybe that answer will help clarify what
you need.

On (3), what data do you have?  Polygon vertices?  How big are the
polygons (i.e. is the curvature of the Earth important)?  Is this for
calculating centroids for use in (1) and (2)?

On Thu, Jul 2, 2009 at 10:42 AM, David Torres<torresd@umich.edu> wrote:
>
> 1.  I was just browsing the web looking for something similar to Cox's
> nearest .ado file and stumbled on the example Austin Nichols gave for
> calculating distances between two different sets of xy coordinates,
> originally from two different data sets.  I am not familiar with the code he
> gave so I have to ask:  Is there an .ado file that can do all that work for
> me?  I mean, if I already have two data sets that I've merged, is there a
> simple command I can input that will give me additional distance variables
> to work with?
>
> What I'm trying to do is calculate distances between census tract and county
> centroids (for the entire US, AK, and HI) to the nearest postsecondary
> institution (of all types and by sector and control of institution: public,
> private, proprietary, pub2year, priv2year, prop2year, pub4year, priv4year,
> prop4year), with population of the tracts or counties used as weights.
>
> 2.  I also would like to produce variables for the total number of
> institutions that fall within a 10, 20, 30, 40, 50, and 100 mile radius of
> each tract and county centroid.
>
> I can do all of this in ArcGIS, to be sure, but with eight years of data,
> and ten different .dbf/.shp files per year, this would be a tedious chore.
>  I would prefer to spend an hour and write a .do file that will do in
> minutes what it will take hours to do in ArcMap/ArcInfo.
>
> 3.  The shp2dta command produces xy coordinates for area centroids that I am
> not familiar with.  Does anyone know if the code in the .ado file can be
> changed to produce what I want--regular xy or lat/lon coordinates?
>
> With regard to the first two parts of my questions, here is Austin Nichols'
> code, the first part of which I don't really care about:
<snip>
> Any help anyone can offer is greatly appreciated.
>
> Thanks,
>
> David Torres

```
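
The code referred to in the reply is snipped, so what follows is not Austin Nichols' code. It is only a minimal sketch of the approach he describes: a nonmatched (sequential) -merge- to get both sets of coordinates in memory, then a single loop over centroid observations that yields the nearest distance for (1) and the radius counts for (2). The toy data, the variable names (`cid`, `clat`, `clon`, `sid`, `slat`, `slon`, `nearest`, `n10` ... `n100`), the haversine formula, and the Earth radius of 3959 miles are all assumptions made for illustration. It also assumes coordinates are already in decimal degrees, and it uses the Stata 11+ `merge 1:1 _n` syntax.

```stata
* Toy school coordinates (hypothetical values, decimal degrees)
clear
input sid slat slon
1 42.28 -83.74
2 42.73 -84.48
3 41.88 -87.63
end
tempfile schools
save `schools'

* Toy tract/county centroids (hypothetical values, decimal degrees)
clear
input cid clat clon
1 42.33 -83.05
2 41.85 -87.65
end

* Nonmatched merge: rows are paired by position only, so both sets of
* coordinates are in memory at once. Centroids stay in rows 1..nc.
merge 1:1 _n using `schools', nogenerate
quietly count if !missing(cid)
local nc = r(N)

gen double nearest = .
foreach r of numlist 10 20 30 40 50 100 {
    gen long n`r' = .
}
tempvar d
gen double `d' = .

forvalues i = 1/`nc' {
    * Haversine distance (miles) from centroid i to every school row
    quietly replace `d' = 2*3959*asin(sqrt( ///
        sin((slat - clat[`i'])*_pi/360)^2 + ///
        cos(clat[`i']*_pi/180)*cos(slat*_pi/180)* ///
        sin((slon - clon[`i'])*_pi/360)^2)) if !missing(sid)
    * Smallest distance = nearest school for centroid i
    quietly summarize `d', meanonly
    quietly replace nearest = r(min) in `i'
    * Number of schools within each radius (missing distances never count)
    foreach r of numlist 10 20 30 40 50 100 {
        quietly count if `d' <= `r'
        quietly replace n`r' = r(N) in `i'
    }
}
list cid nearest n10 n50 n100 if !missing(cid)
```

To get distances by sector or control of institution, one option would be to add a further condition to the `if !missing(sid)` qualifier (for example on a sector indicator, if the school data have one) and store the results in separate variables. Population weighting is left out here because, as the reply notes, it is not yet clear what the weighted quantity should be.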