Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: Generate correlated binary data for fixed rho

From	"Georgia Ntani" <[email protected]>
To	"[email protected]" <[email protected]>
Subject	st: Generate correlated binary data for fixed rho
Date	Fri, 23 Nov 2012 17:56:32 +0000

Dear Statalisters

I am trying to generate binary correlated data for fixed parameter b
(effect of covariates on the outcome), number and size of clusters, and
ICC. For this, I follow the algorithm suggested by Santos et al (Santos
et al. Estimating adjusted prevalence ratio in clustered cross-sectional
epidemiological data. BMC Medical research methodology). Below, is what I
am trying to do in algorithm steps.

1. Set 10 clusters with 30 observations per cluster
2. Generate a dichotomous independent variable X1j. This will be a
group-level variable (X1j=1 for the first 5 clusters and 0 for the last 5
clusters)
3. Generate a continuous independent variable X2ij from a Normal(0,1)
distribution. That will be an individual level variable
4. Generate a normal variable u0, such that for given cluster j,
uoj~N(0,sigma_u^2), where uoj and uoj' are independent for j different
from j'. The intraclass correlation coefficient (ICC) is defined as
(sigma_u^2 )/(sigma_u^2 +(pi^2 /3))
5. Generate probability of the outcome that will be from a random effects
logistic model
Pij=exp(X*b+u0)/(1+exp(X*b+u0))
6. Generate a binary outcome Yij with probability Pij
The Stata commands that I used to run the above are
 
set obs 300
gen cluster=1 if _n<=30
forvalues i=2(1)10 {
replace cluster=`i' if _n<=`i'*30 & cluster==.
}
gen x1=1 if cluster<=5
replace x1=0 if x1==.
generate x2 = invnorm(uniform())

preserve
/* From the formula above,if ICC=0.03 then sigma_u will be ///
sqrt(0.01*_pi^2)/0.97 */
di sqrt(0.01*_pi^2)/0.97
set seed 54325821
clear all
corr2data u01 u02 u03 u04 u05 u06 u07 u08 u09 u010, n(30) ///
means(0 0 0 0 0 0 0 0 0 0) sds(.319 .319 .319 ///
.319 .319 .319 .319 .319 .319 .319)
gen sn=_n
reshape long u0 , i(sn) j(cluster)
drop sn
sort cluster u0
save randomerror.dta, replace
restore
 
sort cluster
merge cluster using randomerror
drop _merge
 
gen prob=exp(x1*0.47+x2*0.3665+u0)/(1+exp(x1*0.47+x2*0.3665+u0))
gen u=runiform()
gen y = (u > prob)
 
The problem now comes when I run the command - xtlogit y i.x1 x2,
i(cluster) quad(30) or-- What I get is the following

Random-effects logistic regression              Number of obs      =       300
Group variable: cluster                         Number of groups   =        10
 
Random effects u_i ~ Gaussian                   Obs per group: min =        30
                                                               avg =      30.0
                                                               max =        30
 
                                                Wald chi2(2)       =     14.95
Log likelihood  = -199.56666                    Prob > chi2        =    0.0006
 
------------------------------------------------------------------------------
           y |         OR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |   .5943426   .1412417    -2.19   0.029      .373039    .9469334
          x2 |   .6737118    .081714    -3.26   0.001     .5311688    .8545073
-------------+----------------------------------------------------------------
    /lnsig2u |  -15.53268   280.1426                     -564.6021    533.5368
-------------+----------------------------------------------------------------
     sigma_u |   .0004238   .0593568                      2.5e-123    7.2e+115
         rho |   5.46e-08   .0000153                      1.9e-246           1
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =     0.00 Prob >= chibar2 = 1.000

My question is why I am getting a sigma_u of 0.0004? Aren't I supposed to
be getting sigma_u of 0.31 from the defined u0 (random error) above?

Any help on this would be highly appreciated
Many thanks
Georgia


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

Prev by Date: Re: st: Recoding all variables in a dataset
Next by Date: Re: st: Recoding all variables in a dataset
Previous by thread: st: Recoding all variables in a dataset
Next by thread: st: looping over a variable list of variables
Index(es):
- Date
- Thread