Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | "Lacy,Michael" <Michael.Lacy@colostate.edu> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | st: Estimating a model for whether pairs of subjects gave matching responses |
Date | Thu, 25 Jul 2013 17:32:12 +0000 |
I'm trying to create a binary response model for a problem that I had originally approached through my own -optimize()- routine. I now think this situation might be framed so that it can be estimated more easily with some existing kind of model, or perhaps with -ml- rather than -optimize()-. I'd like some advice on how to approach this with a tool other than -optimize()-. Perhaps something even easier is possible. Here's an outline of the situation:

Data: Each of N subjects answers K questions, each with L response categories. The outcome of interest (Y = 0/1) is whether the responses of subject i and subject j match on each question, for each distinct pair of subjects. The data set thus consists of N*(N-1)/2 pairs of subjects, each observed on all K questions (K*N*(N-1)/2 observations in all), with four variables on each observation:

(view in a non-proportional font)

Question#   id_i   id_j   Y
1           1      2      1
2           1      2      0
...
K           ...
1           1      3      1
...
K           1      3      1
...
1           N-1    N      1
...

Model: Whether subject i and subject j gave the same response (Y = 1) on any given question is to be fit with a particular theoretically stipulated model that depends on the latent traits b_i and b_j. The values of these variables for each subject are the parameters to be estimated. Conditional on the latent trait, responses of subjects are assumed independent of one another, and responses of each subject to successive questions are independent. The model to be estimated is stipulated as:

Prob(i matches j on question k) = (b_i * b_j * R) + (1 - R)

The latent traits b_i and b_j represent probabilities, so that 0 <= b_i, b_j <= 1.0. R is a constant known a priori, with R = 1 - 1/L. (R arises from an assumption about a guessing component of subjects' responses.)

==============

I'd gladly <grin> accept answers in either of the following directions:

1) Can someone suggest a way to fit this into an existing estimation routine available in Stata, perhaps -nl-? (I have no experience with that, as it happens.)

2) Can this be approached with a regular -ml- routine?
I can't see how to do that, since I can't see how to fit this problem into the framework of f(y) = xb, even though the log likelihood can be written as the sum of the LL for each observation.

Regards,

Mike Lacy
Dept. of Sociology
Colorado State University
Fort Collins CO 80523-1784
970.491.6721 (voice)
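
For illustration only, the stipulated model and the log likelihood written as a sum over observations can be sketched as below. This is plain Python rather than Stata or Mata, and it is not the routine described in the post. The function names (match_prob, log_likelihood), the 0-based subject indices, and the (i, j, y) row layout are hypothetical. The logistic reparameterization b = 1/(1 + exp(-theta)), used to keep each b strictly between 0 and 1, is an added assumption and not part of the original model statement.

```python
import math


def match_prob(b_i, b_j, L):
    """Prob(i matches j on a question) = b_i * b_j * R + (1 - R), with R = 1 - 1/L."""
    R = 1.0 - 1.0 / L
    return b_i * b_j * R + (1.0 - R)


def log_likelihood(theta, rows, L):
    """Sum of per-observation Bernoulli log likelihoods.

    theta : list of unconstrained values, one per subject (assumption:
            b = 1 / (1 + exp(-theta)) keeps each trait inside (0, 1)).
    rows  : iterable of (i, j, y) with 0-based subject indices and y in {0, 1};
            the question number is omitted because it does not enter the model.
    L     : number of response categories.
    """
    b = [1.0 / (1.0 + math.exp(-t)) for t in theta]
    ll = 0.0
    for i, j, y in rows:
        p = match_prob(b[i], b[j], L)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll


if __name__ == "__main__":
    # Three subjects, two questions, L = 4 categories (made-up illustrative rows).
    rows = [(0, 1, 1), (0, 1, 0), (0, 2, 1), (0, 2, 1), (1, 2, 0), (1, 2, 1)]
    print(log_likelihood([0.0, 0.5, -0.5], rows, L=4))
```

A general-purpose optimizer could then maximize log_likelihood over theta. Whether the same per-observation form fits the requirements of a particular -ml- evaluator type is the open question in the post and is not settled by this sketch.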