[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]

st: RE: sample code in -heckprob-

From	"Scott Merryman" <[email protected]>
To	<[email protected]>
Subject	st: RE: sample code in -heckprob-
Date	Mon, 5 Sep 2005 12:22:02 -0500

> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Katsuhide Isa
> Sent: Tuesday, August 30, 2005 7:57 AM
> To: Statalist
> Subject: st: sample code in -heckprob-

<snip>
 
> It looks like a symmetric positive-definite matrix is first
> geneated, then is decomposed into the product of lower triangular
> matrix and its transpose, whose [2,1] and [2,2] elements are
> used to construct correlated two (bivariate normal) error
> terms.
> 
> But I don't see why Cholesky decomposition is required here
> to create these error terms.
> I'm afraid no persuasive explanations can be found in the
> manual, though it might be just a standard routine for
> programmers.
> 

Generating Correlated Random Normal Variables


Suppose we have 3 independent normal variables x_i with zero mean, unit
variance, and zero covariance; X ~ MN(0, I). We want to transform so that
they will have correlation structure P.

:         P
[symmetric]
         1     2     3
    +-------------------+
  1 |    1              |
  2 |  .25     1        |
  3 |   .5   .75     1  |
    +-------------------+


Since P is a symmetric nonnegative definite matrix it can be decomposed into


P = AA'

where A is a lower triangular matrix.

If we multiply the original data X times A' 

 R = X*A'

The resulting variables in R will have the desired correlation.  

I believe the argument is as follows:

cov(R) = E(R'*R) = E(A*X'*X*A') = A*E(X'*X)A = A*I*A' = P


Below is an example of generating 3 random variables with a given
correlation matrix, P (though this uses -mata-, it could translated to in
Stata 8.2).

--------------------------------------------------
clear
matrix C = (1, 0, 0\0, 1, 0\0, 0, 1)
corr2data x1 x2 x3, n(20000) corr(C) clear
corr x*
mata
	P =(1, .25, .5\.25,1 , .75 \ .5, .75 , 1)
	A = cholesky(P)
	X= st_data( ., .)
	R = X*A'
	corr = correlation(R,1)
	st_matrix ("corr", corr)
	st_matrix ("R", R)
end
matrix list corr
svmat R
corr R*
-------------------------------------------------

. clear

. matrix C = (1, 0, 0\0, 1, 0\0, 0, 1)

. corr2data x1 x2 x3, n(20000) corr(C) clear
(obs 20000)

. corr x*
(obs=20000)

             |       x1       x2       x3
-------------+---------------------------
          x1 |   1.0000
          x2 |   0.0000   1.0000
          x3 |  -0.0000  -0.0000   1.0000


. mata
------------------------------------------------- mata (type end to exit) --
:         P =(1, .25, .5\.25,1 , .75 \ .5, .75 , 1)

:         A = cholesky(P)

:         X= st_data( ., .)

:         R = X*A'

:         corr = correlation(R,1)

:         st_matrix ("corr", corr)

:         st_matrix ("R", R)

: end
----------------------------------------------------------------------------

. matrix list corr

symmetric corr[3,3]
     c1   c2   c3
r1    1
r2  .25    1
r3   .5  .75    1

. svmat R

. corr R*
(obs=20000)

             |       R1       R2       R3
-------------+---------------------------
          R1 |   1.0000
          R2 |   0.2500   1.0000
          R3 |   0.5000   0.7500   1.0000




I hope this helps,
Scott



*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

Prev by Date: st: RE: Why does TRUNCREG cannot find the MLE estimate?
Next by Date: Re: st: Concentration Curve and Index
Previous by thread: st: Estimate marginal effects with interaction terms in non-linnear models.
Next by thread: st: predictive model
Index(es):
- Date
- Thread