Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: mata function for "lookup" or find rank if observation not in the ranked sample

| | |
|---|---|
| From | László Sándor |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: mata function for "lookup" or find rank if observation not in the ranked sample |
| Date | Mon, 14 May 2012 22:31:02 -0400 |

```
To follow up on this:

I figured out a way with permutation functions, though it is messy
(FWIW, I paste it below).

More importantly, I was surprised to see that even in samples of 50K,
this approach was significantly slower than running minindex() over
the entire N-vectors. I know minindex() is built in, pure C, etc., but
is it reasonable that it can beat a simple (well, simplish) lookup,
some logical checks, and a minindex() over a much shorter vector?

I would very much appreciate some StataCorp comments on this. I
invested a bit in this new approach (which is more complicated for
cross-treatment matching) for coding (one-dimensional) matching, and
I'm surprised by the result.

Thanks,

Laszlo

***** Code with some explanation:
Instead of simply using this to match treated observations close to
the treated observation in question:
minindex(abs(psc1:-psc1[i]),L,yki,ykw)

I do some work first out of the loop:
psc1order1 = order(psc1,1)
psc1invorder1 = invorder(psc1order1)

and then guess that the closest L (plus ties) might be among the 2L
observations ranked just below or just above the current observation,
doubling the window whenever a match lands on its edge:
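for (lcl = 2*L; lcl <= 2*n1; lcl = 2*lcl) {
    if (lcl > n1) {
        // window would cover everything: fall back to the full vector
        minindex(abs(psc1:-psc1[i]), L, yki, ykw)
        break
    }
    else {
        // rows ranked within lcl positions of observation i
        matchcandidateindices = psc1order1[| max((psc1invorder1[i]-lcl,1)),1 \ min((psc1invorder1[i]+lcl,n1)),1 |]
        minindex(abs(psc1[matchcandidateindices]:-psc1[i]), L, yki, ykw)
        if (anyof(yki,1) | anyof(yki,rows(matchcandidateindices))) {
            // a match sits on the window edge: widen and retry
            continue
        }
        else {
            // map window positions back to row numbers of psc1
            yki = matchcandidateindices[yki]
            break
        }
    }
}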

On Mon, May 14, 2012 at 11:04 AM, László Sándor <sandorl@gmail.com> wrote:
> Hi,
> I have another puzzle in Mata for Stata 10 and above. My previous
> thread on this list helped me match, say, treated observations to
> treated observations, as I look up observations with propensity scores
> close to the observation's in a (selected) vector that the observation
> itself is part of. The use of permutation vectors can be a dramatic
> improvement compared to many, many runs of -minindex- on the full
> propensity score vector.
>
> However, just as important would be to find observations with similar
> propensity scores in the subsample with opposite treatment. The
> observation itself is not part of that subsample, so I cannot simply
> look up its own rank there in a permutation vector I generate only
> once before I loop through all observations. Somehow I would need to
> know the rank of, say, a treated observation in the subsample of the
> control group. Actually, to really spare me costly runs on the entire
> control propensity-score vector for each (treated) iteration, this
> lookup should ideally use some information I can generate once for the
> whole population, from the ranks there and/or in the two subsamples.
>
>
> Thanks!
>
> Laszlo
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/

```
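One possible answer to the cross-treatment question in the quoted message, sketched here under stated assumptions rather than taken from the thread: rank the pooled sample once, then take a running count of controls along that ranking. For a treated observation, that count is its insertion point among the sorted controls, and in one dimension the L nearest controls can lie no further than L positions on either side of it. The names `ctrl_rank()`, `nearest_controls()`, `psc`, `d` and `cidx` are hypothetical; `d` is assumed to be a 0/1 treatment indicator and at least one control is assumed to exist. Ties that extend past the edge of the window are not picked up, so an edge check like the one in the post would still be needed if "L plus ties" matters.

```
mata:
// Hypothetical helper: for every observation, the number of controls
// that sort at or before it in the pooled ranking.
//   psc : N x 1 propensity scores (pooled sample)
//   d   : N x 1 treatment indicator, 1 = treated, 0 = control (assumed)
real colvector ctrl_rank(real colvector psc, real colvector d)
{
    real colvector o, cumc

    o    = order(psc, 1)             // pooled permutation vector, computed once
    cumc = runningsum(1 :- d[o])     // controls seen so far, in sorted order
    return(cumc[invorder(o)])        // back to the original row order
}

// Hypothetical helper: rows of the L nearest controls for observation i.
//   cidx : row numbers of the controls, sorted by score
//   r    : ctrl_rank(psc, d)[i], the insertion point of i among controls
real colvector nearest_controls(real colvector psc, real colvector cidx,
                                real scalar i, real scalar r, real scalar L)
{
    real scalar    lo, hi
    real colvector cand, yki
    real matrix    ykw

    lo   = max((r - L + 1, 1))           // at most L controls from below
    hi   = min((r + L, rows(cidx)))      // at most L controls from above
    cand = cidx[| lo \ hi |]
    minindex(abs(psc[cand] :- psc[i]), L, yki, ykw)
    return(cand[yki])                    // row numbers in the pooled sample
}

// Example use, once before the loop over treated observations:
//   o    = order(psc, 1)
//   cidx = select(o, d[o] :== 0)
//   r    = ctrl_rank(psc, d)
// and inside the loop, for treated row i:
//   yki  = nearest_controls(psc, cidx, i, r[i], L)
end
```

With this layout, each iteration runs minindex() over at most 2L elements, and no doubling loop is needed unless ties at the window edge have to be recovered.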