

From: Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables
Date: Wed, 31 Aug 2011 05:47:30 -0700

Throw in some orthogonal, zero-mean noise when constructing Z (runiform() has mean 0.5, so it is centred here):

g z = .15625*x+.40625*y + (runiform() - .5)

T

On Wed, Aug 31, 2011 at 5:41 AM, fjc <fjc120@gmail.com> wrote:
> Thanks, Tirthankar.
>
> This answers my question as originally posted.
>
> Now, something I didn't say in my earlier post (and I think I should
> have) is that after I generate the new variable (z) I would like to
> run a regression of y on x and z. But if I generate z in the way you
> propose, I will get perfect collinearity. Is there any other way to
> generate z without getting this collinearity?
>
> Francisco.
>
> P.S. The reason I want to run the aforementioned regression is the
> following. Suppose I have an initial regression of y on x, and the
> coefficient on x turns out to be insignificantly different from zero
> at some chosen confidence level. Then I want to generate an example in
> which, by adding a new (artificial) variable z as a covariate, I can
> get the coefficient on x to become significantly different from zero
> at the same confidence level. Based on the formula for the t-test, I
> think I can do this if I can control the correlations between the
> artificial variable and the original ones. The exercise is just for
> expositional purposes; I do not want to attach any deep meaning to it.
>
> On Wed, Aug 31, 2011 at 9:00 AM, Tirthankar Chakravarty
> <tirthankar.chakravarty@gmail.com> wrote:
>> This question has appeared a few times before - in that you want to
>> create a variable with a pattern of correlation with _existing_
>> variables, which -corr2data- does not do. In an example where means
>> are normalised to zero, this can be had by solving a system of linear
>> equations in appropriate expectations.
>>
>> Suppose you generate a variable as
>>
>> Z = a*X + b*Y ---(0)
>>
>> where a and b are constants to be determined. Then you can derive the
>> following identities under the zero mean assumption:
>>
>> Cov(Z, X) = a*Var(X) + b*Cov(X, Y) ---(1)
>> Cov(Z, Y) = b*Var(Y) + a*Cov(X, Y) ---(2)
>>
>> Here you know everything (you set Cov(Z, X) and Cov(Z, Y)), and this
>> is a system of two equations in two unknowns, a and b. Solve them and
>> generate your variables as in equation (0).
>>
>> So for example, if I have Cov(X, Y) = .6 and Var(X) = Var(Y) = 1, and
>> I want Cov(Z, X) = .4 and Cov(Z, Y) = .5, then a = 0.15625,
>> b = 0.40625.
>> /************************************/
>> mat mCov = (1, .6\ .6, 1)
>> // generate x and y
>> corr2data x y, cstorage(full) cov(mCov) n(100000) clear
>> // generate z based on current sample of x and y
>> g z = .15625*x+.40625*y
>> corr, covariance
>> /************************************/
>>
>> All these calculations are assuming zero means - more tedious algebra
>> will allow you to generalise.
>>
>> T
>>
>> On Wed, Aug 31, 2011 at 3:53 AM, Henrique Neder <hdneder@ufu.br> wrote:
>>> Try corr2data:
>>>
>>> matrix C = (1,0,.80,-.80\0,1,0,0\.80,0,1,-.80\-.80,0,-.80,1)
>>> corr2data hsperc corzer1 corpos1 corneg1, n(4137) corr(C)
>>>
>>> Henrique Neder
>>>
>>> -----Original Message-----
>>> From: owner-statalist@hsphsun2.harvard.edu
>>> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of fjc
>>> Sent: Tuesday, August 30, 2011 23:00
>>> To: statalist@hsphsun2.harvard.edu
>>> Subject: st: generating a variable with pre-specified correlations with
>>> other two (given) variables
>>>
>>> Dear Statalisters:
>>>
>>> I have a dataset with two variables, x and y.
>>>
>>> I would like to generate a new artificial variable, z, with
>>> pre-specified correlations with x and y (no particular distribution
>>> required).
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Best,
>>>
>>> Francisco.
>>>
>>> P.S. I'm using Stata 11 (on Windows XP)

--
Tirthankar Chakravarty
tchakravarty@ucsd.edu
tirthankar.chakravarty@gmail.com

*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
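
The steps discussed in the thread (solve equations (1) and (2) for a and b, then add noise so that z is not an exact linear combination of x and y) can be put together in one do-file. What follows is a minimal sketch, not the posters' own code. The target covariances .4 and .5 are the values implied by the posted a = .15625 and b = .40625. The sample size, the seed, the scalar names coef_a, coef_b and sd_e, and the use of rnormal() in place of runiform() are all assumptions made for illustration. The noise standard deviation is chosen so that Var(z) is about 1, which makes the covariances with x and y read as correlations.

```stata
* Sketch only: sample size, seed, and scalar names are illustrative assumptions
clear
set seed 20110831

* Covariance matrix of x and y from the thread: Var = 1, Cov(x, y) = .6
matrix S = (1, .6 \ .6, 1)
corr2data x y, n(1000) cov(S) cstorage(full)

* Target covariances Cov(z, x) = .4 and Cov(z, y) = .5 (implied by the posted a and b)
matrix c = (.4 \ .5)

* Solve equations (1) and (2), written as S*(a \ b) = c
matrix ab = invsym(S)*c
* The listed values should be a = .15625 and b = .40625
matrix list ab
scalar coef_a = ab[1,1]
scalar coef_b = ab[2,1]

* Var(a*x + b*y) = a*.4 + b*.5, so this noise sd gives Var(z) = 1 in expectation
scalar sd_e = sqrt(1 - (coef_a*.4 + coef_b*.5))

* z = a*x + b*y + independent zero-mean noise, so z is not collinear with x and y
generate double z = coef_a*x + coef_b*y + rnormal(0, sd_e)

* Check the correlations, then run the regression from the follow-up question
correlate x y z
regress y x z
```

Because the noise is orthogonal to x and y only in expectation, the sample correlations of z with x and y will be close to .4 and .5 rather than exactly equal to them; a larger n() tightens the match.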

