# st: RE: sample code in -heckprob-

 From "Scott Merryman" To Subject st: RE: sample code in -heckprob- Date Mon, 5 Sep 2005 12:22:02 -0500

```> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-
> statalist@hsphsun2.harvard.edu] On Behalf Of Katsuhide Isa
> Sent: Tuesday, August 30, 2005 7:57 AM
> To: Statalist
> Subject: st: sample code in -heckprob-

<snip>

> It looks like a symmetric positive-definite matrix is first
> geneated, then is decomposed into the product of lower triangular
> matrix and its transpose, whose [2,1] and [2,2] elements are
> used to construct correlated two (bivariate normal) error
> terms.
>
> But I don't see why Cholesky decomposition is required here
> to create these error terms.
> I'm afraid no persuasive explanations can be found in the
> manual, though it might be just a standard routine for
> programmers.
>

Generating Correlated Random Normal Variables

Suppose we have 3 independent normal variables x_i with zero mean, unit
variance, and zero covariance; X ~ MN(0, I). We want to transform so that
they will have correlation structure P.

:         P
[symmetric]
1     2     3
+-------------------+
1 |    1              |
2 |  .25     1        |
3 |   .5   .75     1  |
+-------------------+

Since P is a symmetric nonnegative definite matrix it can be decomposed into

P = AA'

where A is a lower triangular matrix.

If we multiply the original data X times A'

R = X*A'

The resulting variables in R will have the desired correlation.

I believe the argument is as follows:

cov(R) = E(R'*R) = E(A*X'*X*A') = A*E(X'*X)A = A*I*A' = P

Below is an example of generating 3 random variables with a given
correlation matrix, P (though this uses -mata-, it could translated to in
Stata 8.2).

--------------------------------------------------
clear
matrix C = (1, 0, 0\0, 1, 0\0, 0, 1)
corr2data x1 x2 x3, n(20000) corr(C) clear
corr x*
mata
P =(1, .25, .5\.25,1 , .75 \ .5, .75 , 1)
A = cholesky(P)
X= st_data( ., .)
R = X*A'
corr = correlation(R,1)
st_matrix ("corr", corr)
st_matrix ("R", R)
end
matrix list corr
svmat R
corr R*
-------------------------------------------------

. clear

. matrix C = (1, 0, 0\0, 1, 0\0, 0, 1)

. corr2data x1 x2 x3, n(20000) corr(C) clear
(obs 20000)

. corr x*
(obs=20000)

|       x1       x2       x3
-------------+---------------------------
x1 |   1.0000
x2 |   0.0000   1.0000
x3 |  -0.0000  -0.0000   1.0000

. mata
------------------------------------------------- mata (type end to exit) --
:         P =(1, .25, .5\.25,1 , .75 \ .5, .75 , 1)

:         A = cholesky(P)

:         X= st_data( ., .)

:         R = X*A'

:         corr = correlation(R,1)

:         st_matrix ("corr", corr)

:         st_matrix ("R", R)

: end
----------------------------------------------------------------------------

. matrix list corr

symmetric corr[3,3]
c1   c2   c3
r1    1
r2  .25    1
r3   .5  .75    1

. svmat R

. corr R*
(obs=20000)

|       R1       R2       R3
-------------+---------------------------
R1 |   1.0000
R2 |   0.2500   1.0000
R3 |   0.5000   0.7500   1.0000

I hope this helps,
Scott

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```