
# Re: st: Mata - extracting various vectors of different sizes in one loop

From: Nick Cox
To: "statalist@hsphsun2.harvard.edu"
Subject: Re: st: Mata - extracting various vectors of different sizes in one loop
Date: Thu, 4 Apr 2013 16:20:53 +0100

```
One possibility is that you just maintain two longish vectors in Mata,
one keeping track of some identifier and the other the distance
concerned. Naturally, you still need to set up those vectors. Then use
-reshape- in Stata to get the data structure you want (whether it's
the data structure you really need is not obvious).

A more specific suggestion that's part of the folklore is to observe
that taking a square root is not needed for _all_ Pythagorean distance
calculations. You can select on the _squared_ distance being less than
your distance threshold squared and then (later) square root only the
distances you care about. That may be a twentieth-century trick that
would only speed up calculations trivially, but often people do this
with fairly large datasets, and you need to compare every place with
every other, so it's worth thinking about.

Nick
njcoxstata@gmail.com

On 4 April 2013 11:26, nick bungy <nickbungystata@hotmail.co.uk> wrote:
> I have a mata code that cycles through grid references (eastings, northings) of x entities and calculates for each entity all the other entities which are within a 10km radius of it.
> So each individual entity has a row vector, with dimensions anywhere between 1 row (1 firm within 10km radius) and ~80 rows (80 firms within 10km radius). This is throwing up conformity errors when I try to store these vectors into a selection of ~80 variables in Stata.
> My thought was to artificially inflate all row vectors to, say, 100 and fill all of the extra cells in each row vector with 0, then I can extract to 100 variables without conformity errors. I can then clean this up quite easily using Stata functions. I'm not quite sure how to go about this though.
> My mata code is the following:
>
> mata:
>          geoeasta = st_data(., "Geoeast")
>          geonortha = st_data(., "Geonorth")
>          n = rows(geoeasta)
>
>    density = .
>    densitytwo = .
>    densitythree = .
>    dups   = .
>
>
>          for(i=1; i<=n; ++i) {
>
>                  d = sqrt((geoeasta:-geoeasta[i]):^2 + (geonortha:-geonortha[i]):^2)
>                  d[i] = .
>                  density = select(d, d[.,1]:<10000)
>                  minindex(density, 80, densitytwo, dups)
>                  st_store(i, ("MSOA1", "MSOA2", "MSOA3" etc etc.), densitytwo)
>      //This stores the nearest neighbours into our variables, which we defined at the top.
>
>   }
>
>
>   end
> I suspect I need a line or two below minindex, which inflates densitytwo to a 100 row vector and fills all the extra rows generated with 0. Or perhaps there is a more elegant way?
>

```
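The two suggestions in the reply (long vectors instead of a ragged set of variables, and comparing squared distances) might be combined roughly as follows. This is only a sketch, not code from the thread: the variable names `Geoeast` and `Geonorth` and the 10,000 threshold come from the quoted question, the coordinates are assumed to be in metres, and the Mata names `idx`, `r2`, `keep`, `src`, `seq`, `nbr` and `dist` are hypothetical.

```mata
mata:
// Assumed inputs: Geoeast and Geonorth as in the quoted code, in metres.
east  = st_data(., "Geoeast")
north = st_data(., "Geonorth")
n     = rows(east)
idx   = 1::n                 // observation numbers 1..n
r2    = 10000^2              // threshold squared, so no sqrt() is needed to select

// Long-format results: one row per (entity, neighbour) pair.
// All four names are hypothetical.
src  = J(0, 1, .)            // observation number of the entity
seq  = J(0, 1, .)            // rank of the neighbour within the entity (1 = nearest)
nbr  = J(0, 1, .)            // observation number of the neighbour
dist = J(0, 1, .)            // distance to the neighbour

for (i = 1; i <= n; i++) {
    d2    = (east :- east[i]):^2 + (north :- north[i]):^2
    d2[i] = .                // missing is never < r2, so entity i excludes itself
    keep  = d2 :< r2
    if (any(keep)) {
        j  = select(idx, keep)       // neighbours within the radius
        dj = select(d2, keep)        // their squared distances
        o  = order(dj, 1)            // nearest first
        k  = rows(j)
        src  = src  \ J(k, 1, i)
        seq  = seq  \ (1::k)
        nbr  = nbr  \ j[o]
        dist = dist \ sqrt(dj[o])    // square root only for the distances kept
    }
}
end
```

Because every entity contributes as many rows as it has neighbours, no padding to a fixed width is needed and the conformability problem does not arise. Appending with `\` copies the vectors on each pass, which is simple but may be slow for large `n`. If the four vectors are then written to a Stata dataset (for example with `st_addobs()`, `st_addvar()` and `st_store()`, after preserving or clearing the original data), something along the lines of `reshape wide nbr dist, i(src) j(seq)` should give one row per entity, if that layout is really what is wanted.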