Bookmark and Share

Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: Generate correlated binary data for fixed rho


From   "Georgia Ntani" <gn@mrc.soton.ac.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: Generate correlated binary data for fixed rho
Date   Fri, 23 Nov 2012 17:56:32 +0000

Dear Statalisters

I am trying to generate binary correlated data for fixed parameter b
(effect of covariates on the outcome), number and size of clusters, and
ICC. For this, I follow the algorithm suggested by Santos et al (Santos
et al. Estimating adjusted prevalence ratio in clustered cross-sectional
epidemiological data. BMC Medical research methodology). Below, is what I
am trying to do in algorithm steps.

1. Set 10 clusters with 30 observations per cluster
2. Generate a dichotomous independent variable X1j. This will be a
group-level variable (X1j=1 for the first 5 clusters and 0 for the last 5
clusters)
3. Generate a continuous independent variable X2ij from a Normal(0,1)
distribution. That will be an individual level variable
4. Generate a normal variable u0, such that for given cluster j,
uoj~N(0,sigma_u^2), where uoj and uoj' are independent for j different
from j'. The intraclass correlation coefficient (ICC) is defined as
(sigma_u^2 )/(sigma_u^2 +(pi^2 /3))
5. Generate probability of the outcome that will be from a random effects
logistic model
Pij=exp(X*b+u0)/(1+exp(X*b+u0))
6. Generate a binary outcome Yij with probability Pij
The Stata commands that I used to run the above are
 
set obs 300
gen cluster=1 if _n<=30
forvalues i=2(1)10 {
replace cluster=`i' if _n<=`i'*30 & cluster==.
}
gen x1=1 if cluster<=5
replace x1=0 if x1==.
generate x2 = invnorm(uniform())

preserve
/* From the formula above,if ICC=0.03 then sigma_u will be ///
sqrt(0.01*_pi^2)/0.97 */
di sqrt(0.01*_pi^2)/0.97
set seed 54325821
clear all
corr2data u01 u02 u03 u04 u05 u06 u07 u08 u09 u010, n(30) ///
means(0 0 0 0 0 0 0 0 0 0) sds(.319 .319 .319 ///
.319 .319 .319 .319 .319 .319 .319)
gen sn=_n
reshape long u0 , i(sn) j(cluster)
drop sn
sort cluster u0
save randomerror.dta, replace
restore
 
sort cluster
merge cluster using randomerror
drop _merge
 
gen prob=exp(x1*0.47+x2*0.3665+u0)/(1+exp(x1*0.47+x2*0.3665+u0))
gen u=runiform()
gen y = (u > prob)
 
The problem now comes when I run the command - xtlogit y i.x1 x2,
i(cluster) quad(30) or-- What I get is the following

Random-effects logistic regression              Number of obs      =       300
Group variable: cluster                         Number of groups   =        10
 
Random effects u_i ~ Gaussian                   Obs per group: min =        30
                                                               avg =      30.0
                                                               max =        30
 
                                                Wald chi2(2)       =     14.95
Log likelihood  = -199.56666                    Prob > chi2        =    0.0006
 
------------------------------------------------------------------------------
           y |         OR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        1.x1 |   .5943426   .1412417    -2.19   0.029      .373039    .9469334
          x2 |   .6737118    .081714    -3.26   0.001     .5311688    .8545073
-------------+----------------------------------------------------------------
    /lnsig2u |  -15.53268   280.1426                     -564.6021    533.5368
-------------+----------------------------------------------------------------
     sigma_u |   .0004238   .0593568                      2.5e-123    7.2e+115
         rho |   5.46e-08   .0000153                      1.9e-246           1
------------------------------------------------------------------------------
Likelihood-ratio test of rho=0: chibar2(01) =     0.00 Prob >= chibar2 = 1.000

My question is why I am getting a sigma_u of 0.0004? Aren't I supposed to
be getting sigma_u of 0.31 from the defined u0 (random error) above?

Any help on this would be highly appreciated
Many thanks
Georgia


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2014 StataCorp LP   |   Terms of use   |   Privacy   |   Contact us   |   Site index