 Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# st: Estimating a model for whether pairs of subjects gave matching responses

| Field | Value |
|---|---|
| From | "Lacy,Michael" |
| To | "statalist@hsphsun2.harvard.edu" |
| Subject | st: Estimating a model for whether pairs of subjects gave matching responses |
| Date | Thu, 25 Jul 2013 17:32:12 +0000 |

```
I'm trying to create a binary response model for a problem that I had approached
originally through my own -optimize()- routine. I now am thinking that this
situation might be able to be framed so as to be estimated more easily with some
existing kind of model, or perhaps even with -ml- rather than -optimize()-.
I'd like some advice on how to approach this with a tool other than
-optimize()-.  Perhaps something even easier is possible.

Here's an outline of the situation:

Data: Each of N subjects answers K questions, each with L response categories.
The outcome of interest (Y = 0/1) is whether the responses of subject i and subject j match
on a given question, recorded for every question and every distinct pair of subjects.
The data set thus consists of K*N*(N-1)/2 observations (K questions for each of the
N*(N-1)/2 pairs), with four variables on each observation:

Question#,   id_i,    id_j,   Y   // view in non-prop. font
1             1        2      1
2             1        2      0
...
K
...
1             1        3      1
...
K             1        3      1
...
1            N-1       N      1
...

Model:
Whether subject i and subject j gave the same response (Y =1) on any given
question is to be fit with a particular theoretically stipulated model that depends
on the latent traits b_i and b_j.  The values of these variables for each subject
are the parameters to be estimated. Conditional on the latent trait, responses
of subjects are assumed independent of one another, and responses of each subject
to successive questions are independent.  The model to be estimated is
stipulated as:

Prob(i matches j on question k) =  (b_i * b_j * R)  + (1-R)

The latent traits b_i and b_j represent probabilities, so that
0 <= b_i, b_j <= 1.0.  R is a constant known a priori, with R = 1 - 1/L.
(The R arises from an assumption on a guessing component of
subjects' responses.)

==============

1) Can someone suggest a way to fit this into an existing estimation
routine available in Stata, perhaps -nl-? (I have no experience
with that, as it happens.)

2) Can this be approached with a regular -ml- routine?  I can't see how to
do that, since I can't see how to fit this problem into the framework of
f(y) = xb, even though the log likelihood can be written as the sum of the
LL for each observation.

Regards,

Mike Lacy
Dept. of Sociology
Fort Collins CO 80523-1784
970.491.6721 (voice)

```
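The data layout and likelihood described in the post can be sketched as follows. This is a minimal illustration in plain Python, not Stata code and not the poster's own -optimize()- routine. The function names (`match_prob`, `pair_rows`, `log_lik`) and the toy responses are hypothetical. The sketch also relies on one assumption that is not in the post: writing a_s = log(b_s) turns the product b_i * b_j into exp(a_i + a_j), so each observation depends on the parameters only through the single linear index a_i + a_j. That may be one way to get closer to the "xb" form that question 2 asks about, although the restriction a_s <= 0 (i.e. b_s <= 1) is not enforced here.

```python
import math
from itertools import combinations


def match_prob(b_i, b_j, L):
    """Stipulated model: P(i matches j) = b_i * b_j * R + (1 - R), with R = 1 - 1/L."""
    R = 1.0 - 1.0 / L
    return b_i * b_j * R + (1.0 - R)


def pair_rows(responses):
    """Build the long data set: one row (question, id_i, id_j, Y) per question
    for each distinct pair of subjects.

    `responses` maps a subject id to that subject's list of K answers.
    """
    rows = []
    for i, j in combinations(sorted(responses), 2):
        answers = zip(responses[i], responses[j])
        for k, (r_i, r_j) in enumerate(answers, start=1):
            rows.append((k, i, j, int(r_i == r_j)))
    return rows


def log_lik(a, rows, L):
    """Sum of per-observation Bernoulli log likelihoods.

    `a` maps a subject id to a_s = log(b_s); the index for a row is a_i + a_j.
    Assumes every b_s is strictly between 0 and 1, so log(1 - p) is finite.
    """
    R = 1.0 - 1.0 / L
    total = 0.0
    for _, i, j, y in rows:
        xb = a[i] + a[j]                    # linear index: log(b_i * b_j)
        p = (1.0 - R) + R * math.exp(xb)    # same as match_prob(b_i, b_j, L)
        total += math.log(p) if y else math.log(1.0 - p)
    return total


if __name__ == "__main__":
    # Hypothetical toy data: N = 3 subjects, K = 2 questions, L = 4 categories.
    toy = {1: [2, 3], 2: [2, 1], 3: [4, 3]}
    rows = pair_rows(toy)
    trial_b = {1: 0.9, 2: 0.5, 3: 0.7}      # arbitrary trial values, not estimates
    trial_a = {s: math.log(v) for s, v in trial_b.items()}
    print(len(rows), "observations")
    print("log likelihood at trial values:", log_lik(trial_a, rows, 4))
```

In this parameterization the row for pair (i, j) has a design vector with a 1 in positions i and j and 0 elsewhere, so the index is an ordinary linear combination of the N unknowns. Whether a particular Stata estimation command accepts this form, and how best to impose the bounds on b, would still need to be checked against the Stata documentation.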