st: Estimating a model for whether pairs of subjects gave matching responses

From	"Lacy,Michael" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: Estimating a model for whether pairs of subjects gave matching responses
Date	Thu, 25 Jul 2013 17:32:12 +0000

I'm trying to fit a binary response model for a problem that I originally
approached with my own -optimize()- routine. I now suspect that the problem
can be framed so that it is estimated more easily with an existing estimation
command, or at least with -ml- rather than -optimize()-, and I'd like some
advice on how to do that.

Here's an outline of the situation:

Data: Each of N subjects answers K questions, each with L response categories.
The outcome of interest (Y = 0/1) is whether the responses of subject i and subject j
match on a given question, recorded for each question and each distinct pair of
subjects.  The data set thus consists of K*N*(N-1)/2 observations, with four
variables on each observation:


 Question#    id_i     id_j    Y   // view in non-prop. font
 1             1        2      1
 2             1        2      0
 ...
 K
 ...
 1             1        3      1
 ...
 K             1        3      1
 ...
 1            N-1       N      1
 ...
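
To make the layout concrete, here is a minimal sketch in Python (not Stata) of how the pair-level records could be built from a subject-by-question table of responses. The function name build_pair_records and the toy responses are hypothetical, made up for illustration. The point is only that each distinct pair contributes one row per question, so the row count is K times the number of distinct pairs.

```python
from itertools import combinations

def build_pair_records(responses):
    """responses[s][k] is the category chosen by subject s+1 on question k+1.

    Returns a list of (question, id_i, id_j, y) tuples, one per question
    per distinct pair of subjects, with y = 1 when the two answers match.
    """
    n_subjects = len(responses)
    records = []
    for i, j in combinations(range(n_subjects), 2):  # all pairs with i < j
        for k, (a, b) in enumerate(zip(responses[i], responses[j])):
            records.append((k + 1, i + 1, j + 1, int(a == b)))
    return records

# Hypothetical toy data: N = 3 subjects, K = 2 questions.
toy_responses = [[1, 4], [1, 2], [3, 2]]
records = build_pair_records(toy_responses)
for row in records:
    print(row)
```

With N = 3 and K = 2 this produces 2 * 3 = 6 rows, ordered by pair and then by question, as in the listing above.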
      

Model:  
Whether subject i and subject j gave the same response (Y = 1) on any given
question is to be fit with a particular theoretically stipulated model that depends
on the latent traits b_i and b_j.  The values of these latent traits, one per subject,
are the parameters to be estimated. Conditional on the latent traits, responses
of subjects are assumed independent of one another, and responses of each subject
to successive questions are independent.  The model to be estimated is
stipulated as:

Prob(i matches j on question k) =  (b_i * b_j * R)  + (1-R)

The latent traits b_i and b_j represent probabilities, so that
0 <= b_i, b_j <= 1.0.  R is a constant known a priori, with R = 1 - 1/L.
(The R arises from an assumption about a guessing component of
subjects' responses.)
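
The stipulated probability and the log likelihood it implies can be sketched as follows, in Python rather than Stata. This assumes, as the description above does, that each row is treated as an independent Bernoulli outcome. The names match_prob and log_likelihood, the dictionary b, and the example values are illustrative only.

```python
import math

def match_prob(b_i, b_j, n_categories):
    """Prob(i matches j on a question) = b_i * b_j * R + (1 - R), with R = 1 - 1/L."""
    r = 1.0 - 1.0 / n_categories
    return b_i * b_j * r + (1.0 - r)

def log_likelihood(b, records, n_categories):
    """Sum of Bernoulli log-likelihood contributions over all rows.

    b maps subject id -> latent trait in [0, 1];
    records are (question, id_i, id_j, y) tuples.
    """
    total = 0.0
    for _question, i, j, y in records:
        p = match_prob(b[i], b[j], n_categories)
        if y == 1:
            total += math.log(p)        # p >= 1/L > 0, so this is always defined
        elif p < 1.0:
            total += math.log(1.0 - p)
        else:
            return float("-inf")        # a mismatch is impossible when b_i * b_j = 1
    return total

# Hypothetical values: L = 4 categories, two subjects, two questions.
example_b = {1: 0.8, 2: 0.5}
example_records = [(1, 1, 2, 1), (2, 1, 2, 0)]
print(match_prob(0.8, 0.5, 4))
print(log_likelihood(example_b, example_records, 4))
```

Note that the probability of a match never falls below 1 - R = 1/L, which is the pure-guessing floor.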

==============


I'd gladly <grin> accept answers in either of the following directions:

1) Can someone suggest a way to fit this into an existing estimation
routine available in Stata, perhaps -nl-? (I have no experience
with that, as it happens.)

2) Can this be approached with a regular -ml- routine?  I can't see how to
do that, since I can't see how to fit this problem into the framework of
f(y) = xb, even though the log likelihood can be written as the sum of the
log likelihoods of the individual observations.
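
One common way to respect 0 <= b <= 1 without explicit bounds is to estimate an unconstrained theta for each subject and set b = invlogit(theta). The sketch below, in Python rather than Stata, illustrates that idea with a plain gradient ascent and step halving. It is only an illustration of the structure of the problem under the independence assumption stated earlier, not a description of how -ml- or -nl- would be set up, and every name in it is hypothetical.

```python
import math

def invlogit(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def loglik(theta, records, n_categories):
    """theta maps subject id -> unconstrained value; b = invlogit(theta)."""
    r = 1.0 - 1.0 / n_categories
    total = 0.0
    for _q, i, j, y in records:
        bb = invlogit(theta[i]) * invlogit(theta[j])
        p, q = bb * r + (1.0 - r), r * (1.0 - bb)   # q = 1 - p
        if y == 0 and q <= 0.0:
            return float("-inf")
        total += math.log(p if y == 1 else q)
    return total

def gradient(theta, records, n_categories):
    """Derivative of loglik with respect to each theta (chain rule through b)."""
    r = 1.0 - 1.0 / n_categories
    grad = {s: 0.0 for s in theta}
    for _q, i, j, y in records:
        b_i, b_j = invlogit(theta[i]), invlogit(theta[j])
        p, q = b_i * b_j * r + (1.0 - r), r * (1.0 - b_i * b_j)
        dll_dp = 1.0 / p if y == 1 else -1.0 / q
        grad[i] += dll_dp * r * b_j * b_i * (1.0 - b_i)
        grad[j] += dll_dp * r * b_i * b_j * (1.0 - b_j)
    return grad

def fit(records, subjects, n_categories, n_iter=200):
    """Gradient ascent; a step is accepted only if it raises the log likelihood."""
    theta = {s: 0.0 for s in subjects}
    current = loglik(theta, records, n_categories)
    step = 1.0
    for _ in range(n_iter):
        g = gradient(theta, records, n_categories)
        trial = {s: theta[s] + step * g[s] for s in theta}
        value = loglik(trial, records, n_categories)
        if value > current:
            theta, current = trial, value
        else:
            step *= 0.5
    return {s: invlogit(t) for s, t in theta.items()}, current

# Hypothetical toy rows in the (question, id_i, id_j, y) layout, with L = 4.
toy_records = [(1, 1, 2, 1), (2, 1, 2, 0), (1, 1, 3, 0),
               (2, 1, 3, 0), (1, 2, 3, 0), (2, 2, 3, 1)]
b_hat, ll_hat = fit(toy_records, [1, 2, 3], 4)
print(b_hat, ll_hat)
```

Because a trial step is kept only when it increases the log likelihood, the returned value is never worse than the starting point. A real application would want a proper optimizer and standard errors, which this sketch does not attempt.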

Regards,

Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784
970.491.6721 (voice)


