# Re: st: Simulating multilevel data in Stata

 From "James Shaw" To statalist@hsphsun2.harvard.edu Subject Re: st: Simulating multilevel data in Stata Date Tue, 23 Dec 2008 16:10:13 -0600

```Stas and Alan:

After my last e-mail, I modified Alan's initial code to produce the
following, which seems to work.  The resulting variable that
represents the unit-specific effects is time-invariant.

***
drop _all
set obs 500

gen n = _n
gen u=invnorm(uniform())
expand 3
sort n
gen n2 = _n
gen t= (n2 - (n-1)*3)

matrix V = I(3)
matrix V[1,2]=.5
matrix V[2,1]=.5
matrix V[1,3]=.0
matrix V[3,1]=.0

drawnorm X1 X2 e,cov(V)

matrix V = I(3)
matrix V[1,2]=.3
matrix V[2,1]=.3
matrix V[1,3]=.0
matrix V[3,1]=.0
matrix V[2,3]=.5
matrix V[3,2]=.5

matrix accum A = u X1 X2 ,noc dev
matrix A=(1/1499)*A

matrix H=cholesky(V)
matrix G=cholesky(A)
matrix GI=inv(G)
matrix HGI=H*GI

gen y3=HGI[1,1]*u
gen y1=HGI[2,1]*u+HGI[2,2]*X1
gen y2=HGI[3,1]*u+HGI[3,2]*X1+HGI[3,3]*X2

--
Jim

On Tue, Dec 23, 2008 at 3:28 PM, Stas Kolenikov <skolenik@gmail.com> wrote:
> Here's what I would suggest.
>
> X1, X2 and u have a multivariate normal distribution, say (unless you
> want to study the effects of non-normality, as well). Let us write the
> covariance matrix as Sigma, and block it as Sigma_11, Sigma_12 and
> Sigma_22 so that the upper block corresponds to the time varying X
> variables, and the lower block, to the panel effect u:
>
> mata:
> Sigma = (1, 0.5, 0.3 \ 0.5, 1, 0 \ 0.3, 0, 1)
> Sigma11 = Sigma[|(1,1) \ (2,2)|]
> Sigma21 = Sigma[|(3,1) \ (3,2)|]
> Sigma22 = Sigma[3,3]
>
> // 1. Sample u from say standard normal distribution:
> u = rnormal(1,1,0,1)
>
> // 2. Figure out the conditional distribution X1, X2 | u: the mean is
> mean_u = invsym(Sigma11) * Sigma21' * u
> Sigma11_u = Sigma11 - Sigma21' * Sigma21 / Sigma22
> chS = cholesky(Sigma11_u)
>
> // 3. Draw normal X1 and X2
> X = rnormal(1,2,0,1)
> X = X*chS
>
> // I hope I did not mess up with all the transposes and such :))
>
> // Now, this will generate a triplet of u, X1, X2.
> // You can repeat step 3 a bunch of times to get T values,
> // and that's doable simply by replacing the lines to
> X = rnormal(T,2,0,1)
> X = X*chS
>
> // You will also need to propagate the values of u:
> U = u*J(3,1,1)
>
> // Finally you can add (standard normal) e on top of everything:
> E = rnormal(3,1,0,1)
>
> // Now, a simulated panel will look like
> X, U, E
>
> end
> // of Mata
>
> You now will have to repeat the process N times, and post everything
> back to Stata using st_view(). Most of this stuff is of course doable
> in strict Stata, as well. The critical steps are matrix blocking and
> Cholesky decomposition. You can just do those for your fixed
> covariance matrix of interest, and hard code the numbers into some
> -generate- and -replace- statements.
>
> On Tue, Dec 23, 2008 at 10:55 AM, James Shaw <shawjw@gmail.com> wrote:
>> Dear Statalist members,
>>
>> I want to perform a simulation to show the inconsistency of the OLS
>> and random effects estimators when one of the regressors is correlated
>> with the unit-specific error component.  The specifics of the
>> simulation are as follows;
>>
>> Y[i,t] (the outcome to be modeled) = b0 + b1*X1[i,t] + b2*X2[i,t] +
>> u[i] + e[i,t]
>>
>> i = 1,...,500 indexes subjects
>> t = 1,...,3 indexes time (repeated observations on subjects)
>>
>> X1 and X2 are normally distributed random variables with arbitrary
>> means and variances
>> u is a normally distributed subject-specific error component with mean
>> of 0 and arbitrary variance
>> e is a normally distributed random error component with mean of 0 and
>> arbitrary variance
>>
>> corr(X1,X2) = 0.5
>> corr(X1,u) = 0.3
>> corr(X2,u) = 0.0
>> corr(X1,e) = corr(X2,e) = corr(u,e) = 0.0
>>
>> b0, b1, and b2 are parameters to be specified in the simulation
>>
>>
>> I have been unable to identify a method that will ensure that
>> corr(X1,u) equals the desired value.  I tried the following method in
>> which u was generated separately from X1 and X2 and cholesky
>> decomposition was applied to generate transformations of the three
>> random variables that would exhibit the desired correlations.
>> However, this yielded a non-zero correlation between X2 and u.
>>
>> Method 1
>> ***
>> drop _all
>> set obs 500
>> gen n = _n
>> gen u=invnorm(uniform())
>> expand 3
>> sort n
>> gen n2 = _n
>> gen t= (n2 - (n-1)*3)
>> drawnorm x1 x2 e
>> sort n
>> mkmat x1 x2 u e, matrix(X)
>> mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1)
>> mat X2 = X*cholesky(c)
>> ***
>>
>> A method that yielded somewhat better results involved generating X1,
>> X2, u, and e with a pre-specified correlation matrix and then
>> collapsing u so that it varied by subject only.  This provided the
>> correct values for corr(X1,X2) and corr(X2,u) but attenuated the
>> correlation between X1 and u.  I presume that I could simply specify a
>> higher value for corr(X1,u) when generating the variables so that the
>> desired value would be achieved after u is collapsed.  However, this
>> would not be the most elegant solution.
>>
>> Method 2
>> ***
>> drop _all
>> set obs 500
>> mat c =(1, .5, .3, 0 \ .5, 1, 0, 0 \ .3, 0, 1, 0 \ 0, 0, 0, 1)
>> gen n = _n
>> expand 3
>> sort n
>> gen n2 = _n
>> gen t= (n2 - (n-1)*3)
>> drawnorm x1 x2 u e, corr(c)
>> sort n
>> by n: egen u2 = mean(u)
>> ***
>>
>> Any suggestions or references would be appreciated.
>>
>> Regards,
>>
>> Jim
>>
>>
>> James W. Shaw, Ph.D., Pharm.D., M.P.H.
>> Assistant Professor
>> College of Pharmacy
>> University of Illinois at Chicago
>> 833 South Wood Street, M/C 871, Room 252
>> Chicago, IL 60612
>> Tel.: 312-355-5666
>> Fax: 312-996-0868
>> Mobile Tel.: 215-852-3045
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
>
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
James W. Shaw, Ph.D., Pharm.D., M.P.H.
Assistant Professor
College of Pharmacy
University of Illinois at Chicago
833 South Wood Street, M/C 871, Room 252
Chicago, IL 60612
Tel.: 312-355-5666
Fax: 312-996-0868
Mobile Tel.: 215-852-3045
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```