# st: RE: Matching Problem

From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: Matching Problem
Date: Sun, 17 Oct 2010 16:32:55 +0100

```
I support Stefan's approach here. Note that (e.g.)

gen match = 0
replace match = 1 if int(test/5) == int(train/5)

is just

gen match = int(test/5) == int(train/5)

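However, either has a small problem if there are missing values:
missing == missing is true, so two missing values are flagged as a match,
and a single missing value yields 0 rather than missing.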

gen match = int(test/5) == int(train/5) if !missing(test, train)

is better than either.

Nick
[email protected]

[email protected]

try this:

*** Example Dataset
clear
set obs 1000
* runiform() is the current name of the older uniform() function
gen test = int(runiform()*200)
gen train = int(runiform()*200)
gen match = 0

*** You're not clear about start and end of the intervals you want: 1, 6, 11 vs. 70, 125
*** So choose the command that fits.

** Intervals [0..4] [5..9] [10..14]... [195..199] [200]
replace match = 1 if int(test/5) == int(train/5)

** Intervals [0] [1..5] [6..10] [11..15]... [196..200]
replace match = 1 if int((test+4)/5) == int((train+4)/5)

Raphael Fraser

I have two variables which I will call test and training each
containing integers between 1-200. Each number in the test variable
represents an image. The training variable contains the closest image
to the test image in terms of similarity.

test = (129, 163, 71, 176, 125, ...)
train = (128, 162, 71, 119, 123, ...)

The objective is to match both integers to the same interval. These
are the intervals 1-5, 6-10, 11-15, ..., 196-200. For example,
test=129 and train=128 are both in the same interval 125-130. Also
test=71, train=71 are both in the interval 70-75. These are successful
mappings. I would like a successful mapping to be =1 and failure=0.

```
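A minimal sketch of how the two interval definitions and the missing-value guard behave on a small dataset. The first five pairs are the values quoted in the original question. The last two rows are made up here to exercise missing values, and the variable names `match0`, `match1` and `naive` are hypothetical, chosen only for this illustration.

```stata
* Toy data: rows 1-5 are the pairs quoted in the question;
* rows 6-7 are invented to show what happens with missing values.
clear
input test train
129 128
163 162
 71  71
176 119
125 123
  .   .
 50   .
end

* Intervals [0..4] [5..9] [10..14] ...
gen match0 = int(test/5) == int(train/5) if !missing(test, train)

* Intervals [1..5] [6..10] [11..15] ... [196..200]
gen match1 = int((test+4)/5) == int((train+4)/5) if !missing(test, train)

* Same comparison as match0, but without the missing() guard
gen naive = int(test/5) == int(train/5)

list, sep(0)

* Row 5 (125 vs 123): the answer depends on where the intervals start
assert match0 == 0 & match1 == 1 in 5

* Row 6 (both missing): the unguarded version reports a match
assert naive == 1 & missing(match0) in 6

* Row 7 (one missing): the unguarded version reports 0, not missing
assert naive == 0 & missing(match0) in 7
```

Since the question describes the intervals as 1-5, 6-10, ..., 196-200, the shifted version (`match1` here) appears to be the one that fits. The `if !missing(test, train)` qualifier leaves the result missing whenever either input is missing.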
