
# Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables

From: Tirthankar Chakravarty
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables
Date: Wed, 31 Aug 2011 05:14:23 -0700

```In the above example, I set Cov(Z, X) = .4, and Cov(Z, Y) = .5. You
can specify the correlations instead, but that makes solving for the
values of a and b slightly more tedious.

T

On Wed, Aug 31, 2011 at 5:00 AM, Tirthankar Chakravarty
<tirthankar.chakravarty@gmail.com> wrote:
> This question has come up a few times before: you want to create a
> variable with a given pattern of correlation with _existing_
> variables, which -corr2data- does not do. When means are normalised
> to zero, this can be done by solving a system of linear equations in
> the appropriate expectations.
>
> Suppose you generate a variable as
>
> Z = a*X + b*Y  ---(0)
>
> where a and b are constants to be determined. Then you can derive the
> following identities under the zero mean assumption:
>
> Cov(Z, X) = a*Var(X) + b*Cov(X, Y)  ---(1)
> Cov(Z, Y) = b*Var(Y) + a*Cov(X, Y)  ---(2)
>
> Here you know everything (you set Cov(Z, X) and Cov(Z, Y)), and this
> is a system of two equations in two unknowns, a and b. Solve them and
> generate your variables as in equation (0).
>
> So for example, if I have Cov(X, Y) = .6 and Var(X) = Var(Y) = 1, then
> a = 0.15625 and b = 0.40625.
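> /************************************/
> // covariance matrix of (x, y): Var(x) = Var(y) = 1, Cov(x, y) = .6
> matrix mCov = (1, .6 \ .6, 1)
> // generate x and y with this covariance matrix
> corr2data x y, cstorage(full) cov(mCov) n(100000) clear
> // generate z based on current sample of x and y
> generate z = .15625*x + .40625*y
> // display the covariance matrix of x, y and z
> correlate x y z, covariance
> /************************************/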
>
> All these calculations are assuming zero means - more tedious algebra
> will allow you to generalise.
>
> T
>
> On Wed, Aug 31, 2011 at 3:53 AM, Henrique Neder <hdneder@ufu.br> wrote:
>> Try corr2data:
>>
>> matrix C = (1,0,.80,-.80\0,1,0,0\.80,0,1,-.80\-.80,0,-.80,1)
>> corr2data hsperc corzer1 corpos1 corneg1, n(4137) corr(C)
>>
>> Henrique Neder
>>
>>
>> -----Original message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of fjc
>> Sent: Tuesday, 30 August 2011 23:00
>> To: statalist@hsphsun2.harvard.edu
>> Subject: st: generating a variable with pre-specified correlations with
>> other two (given) variables
>>
>> Dear Statalisters:
>>
>> I have a dataset with two variables, x and y.
>>
>> I would like to generate a new artificial variable, z, with
>> pre-specified correlations with x and y (no particular distribution
>> required).
>>
>> Any help would be greatly appreciated.
>>
>> Best,
>>
>> Francisco.
>>
>> P.S. I'm using Stata 11 (on Windows XP)
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>
>
>
>
> --
> Tirthankar Chakravarty
> tchakravarty@ucsd.edu
> tirthankar.chakravarty@gmail.com
>

```
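
The hand solution of equations (1) and (2) in the thread can also be left to Stata's matrix functions. What follows is only a minimal sketch under the same example values (Var(X) = Var(Y) = 1, Cov(X, Y) = .6, target covariances .4 and .5). The matrix names `V`, `c` and `ab` are illustrative assumptions and do not appear in the original posts.

```stata
// Sketch only: solve equations (1)-(2) for a and b numerically.
// V, c and ab are illustrative names, not taken from the thread.
matrix V  = (1, .6 \ .6, 1)     // (Var(X), Cov(X,Y) \ Cov(X,Y), Var(Y))
matrix c  = (.4 \ .5)           // targets: (Cov(Z,X) \ Cov(Z,Y))
matrix ab = invsym(V)*c         // ab[1,1] is a, ab[2,1] is b
matrix list ab

// rebuild the thread's example using the solved coefficients
corr2data x y, cstorage(full) cov(V) n(100000) clear
generate z = ab[1,1]*x + ab[2,1]*y
correlate x y z, covariance
```

With existing data rather than simulated x and y, `V` could instead be filled from the sample, for example with `correlate x y, covariance` followed by `matrix V = r(C)`.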