Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Aggregating spatial units for SPMAP

 From Robert Picard To statalist@hsphsun2.harvard.edu Subject Re: st: Aggregating spatial units for SPMAP Date Sun, 16 Dec 2012 15:33:47 -0800

```I think I may have a solution. The following program takes a
shapefile and creates new polygons that have the shape of the
outer borders of the adjacent polygons. It builds on the fact
that adjacent polygons have common points along line segments
that are shared. The outer borders are composed of polylines
(connected line segments) that start and end with a shared point
with all in-between points unique to a single polygon. It's a bit
tricky to connect all the outer border polylines, but minimal
testing suggests that the program works.

Let me know if the program works or not in your case. If no
problem is reported, I'll add bells and whistles and write up a
help file and post it to SSC.

Hope this helps, Robert

* --------------------------------------------------------------
version 12
clear all

*! 1.0.0 16dec2012 Robert Picard, picard@netbox.com
program define mergepoly

args id0 lat0 lon0

rename `lat0' lat
rename `lon0' lon

gen obs = _n

// create new id in case there are multipart polygons
gen id = sum(mi(lat,lon))
sort id obs
qui by id: drop if _n == 1

// each polygon should start and end at the same lat/lon
by id: assert lat[1] == lat[_N]
by id: assert lon[1] == lon[_N]

// adjacent polygons have common points
sort lat lon obs
by lat lon: gen N = _N

// identify polylines that form the outside borders;
// they start and end at common points and have a
sort id obs
by id: gen pstart = N > 1 & N[_n+1] == 1
by id: gen pend = N > 1 & N[_n-1] == 1
by id: gen inpoly = sum(pstart - pend)
qui replace inpoly = 1 if pend
gen polyline = sum(pstart) * inpoly

// link polylines by grouping coordinates and
// identifying the next polyline
sort lat lon pend pstart obs
by lat lon: egen pp = total(pstart + pend)
assert pp <= 2
by lat lon: assert pend if pp == 2 & _n == _N
by lat lon: assert pstart if pp == 2 & _n == _N - 1
qui by lat lon: gen next = polyline[_n-1] if _n == _N & pp == 2

// handle special case of line segments on the outer border;
// these are not polylines, they just have 2 points and they
// create a bridge between two polylines.
sort lat lon pend obs
by lat lon: gen sameid = id == id[_N]
sort lat lon pend sameid obs
qui by lat lon: gen link = obs[1]+1 if pend & pp == 1
sort lat lon pstart obs
qui by lat lon: replace link = obs[1] if pstart & pp == 1
qui by pp link: replace next = polyline[2] if pp == 1 & pend
qui by pp link: assert _N == 2 if pp == 1 & !mi(link)

qui keep if polyline
sort polyline obs
by polyline: gen one = _n == 1

// some polylines may be polygons already
by polyline: gen ispolyline = lat[1] != lat[_N] | lon[1] != lon[_N]
sort ispolyline polyline obs
qui gen newpolygon = sum(one) if !ispolyline

// loop over each polyline in order and build new polygons that
sum newpolygon, meanonly
local curpolygon = cond(mi(r(max)), 1, r(max) + 1)
sum polyline if mi(newpolygon), meanonly
local nextpl = r(min)
local more = !mi(r(N))
local startpline `nextpl'

qui gen pline_order = .

local i 0
while `more' {

local i = `i' + 1
qui replace pline_order = `i' if polyline == `nextpl'
qui replace newpolygon = `curpolygon' if polyline == `nextpl'

sum next if polyline == `nextpl', meanonly
if `nextpl' == r(max) local startpline = r(max)
local nextpl = r(max)

if `nextpl' == `startpline' {
local curpolygon = `curpolygon' + 1
sum polyline if mi(newpolygon), meanonly
local nextpl = r(min)
local startpline `nextpl'
}

qui count if mi(newpolygon)
local more = r(N)
}

sort newpolygon pline_order obs
qui expand 2 if one
sort newpolygon pline_order obs
qui by newpolygon: replace lat = . if _n == 1
qui by newpolygon: replace lon = . if _n == 1

rename newpolygon ext`id0'
keep `id0' lat lon ext`id0'
rename lat `lat0'
rename lon `lon0'

end

use "http://fmwww.bc.edu/repec/bocode/i/Italy-RegionsCoordinates.dta";, clear

mergepoly _ID _Y _X

line _Y _X, lwidth(vvthin) lcolor(blue) cmissing(n) ///
aspectratio(1.2) xscale(off) yscale(off) ///
ylabel(minmax, nogrid)  xlabel(minmax)

* --------------------------------------------------------------

On Fri, Dec 14, 2012 at 6:11 AM, Karsten Pfaff
<atze1985@zedat.fu-berlin.de> wrote:
> Dear Statalist members,
> I would like to refer first to the following thread that emerged some
> month earlier on statalist:
>
> http://www.stata.com/statalist/archive/2012-06/msg00384.html
>
> My problem is similar. I have a shapefile (coordinate file) containing
> polygon coordinates about a certain type of regions (calles "kreise") in
> Germany, which allows me to draw a grid of Germany that divides Germany in
> just these regions. However, I am not interested in the regions themselves
> but in some kind of higher order regions called "ror". Of course I also
> have information about which "kreise" are part of which "ror" (just
> imagine you have coordinates about the counties of the USA and based on
> these coordinates you would like to get the coordinates of the federal
> states, knowing which countries belong to which federal state). The data
> file looks as follows (illustrative example):
>
> ror    kreise (=_ID)     _Y     _X
> 101        1         9.0468785  54.872552
> 101        1         9.0646343  54.871183
> 101        1         9.0649463  54.871159
> 101        1         9.090514   54.70260
> 101        2
> 101        2
> 101        2      some coordinates for kreise 2 & 3
> 101        2
> 101        2
> 101        2
> 102        3
> 102        3
>
> So obviously, ror 101 contains kreise 1 and 2.
> Based on these information, I would like to create a shapefile that draws
> polygones of each ror, and ignores the "within" borders. Is there any
> possibility to do so? I think there must be some way since all the
> necessary information to create a map containing but the rors are at hand.
> Any help or suggestion is highly appreciated.
>
> Best regards Karsten Pfaff
>
>
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
```