Re: st: Mata - extracting various vectors of different sizes in one loop

From	Nick Cox <[email protected]>
To	"[email protected]" <[email protected]>
Subject	Re: st: Mata - extracting various vectors of different sizes in one loop
Date	Thu, 4 Apr 2013 16:20:53 +0100

One possibility is that you just maintain two longish vectors in Mata,
one keeping track of some identifier and the other holding the distance
concerned. Naturally, you still need to set up those vectors. Then use
-reshape- in Stata to get the data structure you want (whether it's
the data structure you really need is not obvious).
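
A minimal sketch of that long layout, under assumptions of my own: the
coordinates are toy values typed in directly rather than read with
st_data(), and the names id, nbr, dist and rank are hypothetical. It also
keeps a third vector for the neighbour's identifier, which goes slightly
beyond the two vectors mentioned above. Note that it clears the data in
memory before writing the long dataset.

```mata
mata:
// Toy grid references in metres (hypothetical, not the poster's data)
east  = (0 \ 3000 \ 6000 \ 50000)
north = (0 \ 4000 \ 8000 \ 50000)
n = rows(east)

// Long vectors that grow as neighbours are found
id   = J(0, 1, .)    // focal entity
nbr  = J(0, 1, .)    // neighbouring entity
dist = J(0, 1, .)    // distance between the two

for (i = 1; i <= n; i++) {
    d = sqrt((east :- east[i]):^2 + (north :- north[i]):^2)
    d[i] = .                          // missing is never < 10000, so self is excluded
    keep = d :< 10000
    idx  = select((1::n), keep)       // row numbers of the neighbours
    id   = id   \ J(rows(idx), 1, i)
    nbr  = nbr  \ idx
    dist = dist \ select(d, keep)
}

// CAUTION: this discards whatever dataset is in memory
stata("clear")
st_addobs(rows(id))
st_store(., st_addvar("double", ("id", "nbr", "dist")), (id, nbr, dist))

// Number the neighbours within each entity, nearest first, then go wide
stata("bysort id (dist): generate rank = _n")
stata("reshape wide nbr dist, i(id) j(rank)")
end
```

Entities with no neighbour inside the radius (the fourth toy point here)
contribute no rows, so they are absent from the reshaped dataset unless
merged back in afterwards.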

A more specific suggestion that's part of the folklore is to observe
that taking a square root is not needed for _all_ Pythagorean distance
calculations. You can select on the _squared_ distance being less than
your distance threshold squared and then (later) take square roots of
only the distances you care about. That may be a twentieth-century trick
that would only speed up calculations trivially, but often people do
this with fairly large datasets, and you need to compare every place
with every other, so it's worth thinking about.
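
As a rough illustration of the squared-distance trick for a single
entity, assuming toy coordinates and hypothetical names (limit, d2, near):

```mata
mata:
// Toy grid references in metres (hypothetical, not the poster's data)
east  = (0 \ 3000 \ 6000 \ 50000)
north = (0 \ 4000 \ 8000 \ 50000)

i     = 2         // focal entity
limit = 10000     // radius in metres

// Squared distances only; no sqrt() yet
d2 = (east :- east[i]):^2 + (north :- north[i]):^2
d2[i] = .                              // exclude the entity itself

// Compare against the squared threshold
near = select(d2, d2 :< limit^2)

// Square root only the few distances that survived the selection
dist = sqrt(near)
dist
end
```

Here sqrt() is applied to two values rather than four; whether the saving
matters in practice depends on the number of entities.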

Nick
[email protected]


On 4 April 2013 11:26, nick bungy <[email protected]> wrote:
> I have some Mata code that cycles through grid references (eastings, northings) of x entities and calculates for each entity all the other entities which are within a 10km radius of it.
> So each individual entity has a row vector, with dimensions anywhere between 1 row (1 firm within 10km radius) and ~80 rows (80 firms within 10km radius). This is throwing up conformability errors when I try to store these vectors into a selection of ~80 variables in Stata.
> My thought was to artificially inflate all row vectors to, say, 100 and fill all of the extra cells in each row vector with 0; then I can extract to 100 variables without conformability errors. I can then clean this up quite easily using Stata functions. I'm not quite sure how to go about this though.
> My Mata code is the following:
>
> mata:
>     geoeasta  = st_data(., "Geoeast")
>     geonortha = st_data(., "Geonorth")
>     n = rows(geoeasta)
>
>     density      = .
>     densitytwo   = .
>     densitythree = .
>     dups         = .
>
>     for (i=1; i<=n; ++i) {
>         d = sqrt((geoeasta:-geoeasta[i]):^2 + (geonortha:-geonortha[i]):^2)
>         d[i] = .
>         density = select(d, d[.,1]:<10000)
>         minindex(density, 80, densitytwo, dups)
>         // This stores the nearest neighbours into our variables, which we defined at the top.
>         st_store(i, ("MSOA1", "MSOA2", "MSOA3" etc etc.), densitytwo)
>     }
> end
> I suspect I need a line or two below minindex, which inflates densitytwo to a 100 row vector and fills all the extra rows generated with 0. Or perhaps there is a more elegant way?
>
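
For the padding idea raised in the question, one possible sketch follows.
It is not from the thread: the names k, found and padded are hypothetical,
the values are made up, and it pads with missing values rather than the 0
proposed above, on the assumption that missing is easier to tell apart
from a genuine value later.

```mata
mata:
k     = 5                    // hypothetical number of target variables
found = (12 \ 7 \ 31)        // hypothetical results for one entity (column vector)

// Start from a 1 x k row of missings and overwrite the first m cells
m      = min((rows(found), k))
padded = J(1, k, .)
if (m > 0) padded[|1 \ m|] = found[|1 \ m|]'

padded
end
```

Because padded is always 1 x k, it would be conformable with a single
observation across k variables when passed to st_store(), whatever the
number of neighbours found.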
