I am trying to generate correlated binary data to fit random-effects
models with xtlogit, as part of several Monte Carlo studies. My question
is about the specific form of the data to generate. I have two predictors,
x1 (a within-cluster predictor) and z1 (a between-cluster predictor), and
the outcome is binary. The model takes this form:
logit(pij) = B0j + B1*x1ij + B2*z1ij
B0j = b0 + u0j
so
logit(pij) = b0 + B1*x1ij + B2*z1ij + u0j
where i indexes observations within cluster j.
The issue I have is whether a within-cluster error term should be
added to the model when generating data. That is, should the model used
to generate the binary data look like this (the .3 values appear in the
equations because I copied and pasted them from my code; see below)
generate y1_w_error = .3 + .3*x1 + .3*z1 + u0j + e
or like this
generate y1_wo_error = .3 + .3*x1 + .3*z1 + u0j
With linear models I know that "e" should be included, and in fact my code
below generates data for continuous outcomes that behave well, but I am
uncertain about the logistic case.
My code is listed below.
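* Level 2: create 30 clusters
set obs 30
* z1: binary between-cluster predictor
generate float z1 = round(uniform())
* u0j: random intercept, normal with SD 1.69
generate float u0j = invnorm(uniform())*1.69
generate n = 30
* id2: cluster identifier, 1 to 30
range id2 1 30
* Level 1: expand each cluster to 30 observations
expand n
* x1: binary within-cluster predictor
generate float x1 = round(uniform())
* e: standard normal within-cluster (level 1) error
generate float e = invnorm(uniform())
* Option 1 -- model with level 1 error
generate y1_w_error = .3 + .3*x1 + .3*z1 + u0j + e
generate p_w_error = exp(y1_w_error)/(1+exp(y1_w_error))
generate binary_y_w_error = uniform() <= p_w_error
* Option 2 -- model without level 1 error
generate y1_wo_error = .3 + .3*x1 + .3*z1 + u0j
generate p_wo_error = exp(y1_wo_error)/(1+exp(y1_wo_error))
generate binary_y_wo_error = uniform() <= p_wo_error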
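
One way to compare the two options empirically would be to fit xtlogit to
each generated outcome and set the estimates against the generating values
(.3 for the intercept and each slope, 1.69 for the SD of u0j). What follows
is only a minimal sketch of that check, not a settled procedure. It assumes
the i() and re options of xtlogit; the seed value and the shortened names
(xb_w, xb_wo, bin_w, bin_wo) are illustrative and do not appear in the code
above.

```stata
clear
* Arbitrary seed (an assumption) so that a run can be repeated
set seed 12345
* Level 2: 30 clusters, each with z1 and a random intercept u0j
set obs 30
generate id2 = _n
generate float z1 = round(uniform())
generate float u0j = invnorm(uniform())*1.69
* Level 1: 30 observations per cluster
expand 30
generate float x1 = round(uniform())
generate float e = invnorm(uniform())
* Option 1: linear predictor with the level 1 error e
generate xb_w = .3 + .3*x1 + .3*z1 + u0j + e
generate bin_w = uniform() <= exp(xb_w)/(1+exp(xb_w))
* Option 2: linear predictor without the level 1 error
generate xb_wo = .3 + .3*x1 + .3*z1 + u0j
generate bin_wo = uniform() <= exp(xb_wo)/(1+exp(xb_wo))
* Fit the random-intercept logit model to each outcome
xtlogit bin_w x1 z1, i(id2) re
xtlogit bin_wo x1 z1, i(id2) re
```

A single run says little with only 30 clusters; in a Monte Carlo study the
coefficient and sigma_u estimates from each fit would be collected over many
replications before comparing them with the generating values.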
___________________________________________________________________
Bryan W. Griffin
Curriculum, Foundations, & Reading
P.O. Box 8144
Georgia Southern University
Statesboro, GA 30460-8144