Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down at the end of May, and its replacement, **statalist.org**, is already up and running.


From: Richard Williams <richardwilliams.ndu@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables
Date: Wed, 31 Aug 2011 19:23:10 -0500

mat mCorr = (1, .6, .4 \ .6, 1, .5 \ .4, .5, 1)
corr2data x y z, cstorage(full) corr(mCorr) n(100000) clear
corr
reg z x y
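The earlier message quoted below mentions that standard deviations and means can be specified as well. A minimal sketch of that variant follows, using the means() and sds() options of corr2data. The numeric values in mMean and mSD are made up for illustration; in practice the entries for x and y would come from the observed data and the entry for z would be whatever is desired.

```stata
* Correlations: given for (x, y), desired for z
mat mCorr = (1, .6, .4 \ .6, 1, .5 \ .4, .5, 1)

* Hypothetical means and standard deviations, in the order x y z
mat mMean = (10, 20, 0)
mat mSD   = (2, 3, 1)

* Generate fake data with exactly these sample moments
corr2data x y z, cstorage(full) corr(mCorr) means(mMean) sds(mSD) n(100000) clear

* Check that the requested moments were reproduced
summarize x y z
corr
```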

At 03:23 PM 8/31/2011, fjc wrote:

Hi,

Thank you all for the quick and useful responses.

1. I can do with covariances instead of correlations, so the methods proposed by Tirthankar and Richard work fine.

2. Still, if I wanted to stick to correlations, I think one can apply the same ideas (as suggested in the previous responses):

Let z be given by

(0) z = a * x + b * y + c * u,

where x and y are the two variables in the dataset and u is a zero-mean random variable independent of x and y. From (0) one gets:

(1) Corr(x,z) = a * sd(x)/sd(z) + b * sd(y)/sd(z) * Corr(x,y)
(2) Corr(y,z) = b * sd(y)/sd(z) + a * sd(x)/sd(z) * Corr(x,y)
(3) Var(z) = a^2 * Var(x) + b^2 * Var(y) + c^2 * Var(u) + 2 * a * b * Cov(x,y)

Once we have chosen Corr(x,z), Corr(y,z) and Var(z), we can solve the system above for a, b, and c. Actually, equations (1) and (2) can be solved for a and b to get:

a = [sd(z)/sd(x)] * [Corr(x,z) - Corr(x,y)*Corr(y,z)] / (1 - Corr(x,y)^2)
b = [sd(z)/sd(y)] * [Corr(y,z) - Corr(x,y)*Corr(x,z)] / (1 - Corr(x,y)^2)

Then we can use (3) to obtain the value of c. Finally, we can use (0) to generate z.

Thanks again,
Francisco.

On Wed, Aug 31, 2011 at 3:59 PM, Richard Williams <richardwilliams.ndu@gmail.com> wrote:
> At 07:41 AM 8/31/2011, fjc wrote:
>>
>> Thanks, Tirthankar.
>>
>> This answers my question as originally posted.
>>
>> Now, something I didn't say in my earlier post (and I think I should
>> have) is that after I generate the new variable (z) I would like to
>> run a regression of y on x and z. But if I generate z in the way you
>> propose, I will get perfect collinearity. Is there any other way to
>> generate z without getting this collinearity?
>
> Slightly tweaking the earlier example, does this do what you want?
>
> mat mCorr = (1, .6, .4 \ .6, 1, .5 \ .4, .5, 1)
> corr2data x y z, cstorage(full) corr(mCorr) n(100000) clear
> corr
> reg z x y
>
> Again, mCorr is a combo of the given correlations for x and y with the
> desired correlations for z. If you want, you can also specify standard
> deviations and means, both observed (for x and y) and desired (for z). I am
> faking all the data, although the correlations etc. can come from real data.
> If you want to do some combo of fake and real (e.g. generate a z using the
> real x and real y) it can probably be done but would take a bit more work.
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: Richard.A.Williams.5@ND.Edu
> WWW: http://www.nd.edu/~rwilliam
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
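Francisco's steps (0) to (3) can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: the x and y created at the top are stand-ins for the real variables, the target values (0.4, 0.5, and sd(z) = 1) are borrowed from the mCorr example, and u is drawn with rnormal() so that Var(u) = 1. Because u is only independent of x and y in expectation, the resulting sample correlations should be close to the targets but not exact.

```stata
* Stand-in data so the sketch runs on its own; with real data, load x and y instead
clear
set seed 12345
matrix C = (1, .6 \ .6, 1)
corr2data x y, corr(C) n(10000)

* Assumed targets for Corr(x,z), Corr(y,z) and sd(z)
local rxz = 0.4
local ryz = 0.5
local sdz = 1

* Observed Corr(x,y), sd(x) and sd(y)
quietly correlate x y
local rxy = r(rho)
quietly summarize x
local sdx = r(sd)
quietly summarize y
local sdy = r(sd)

* a and b from the closed-form solution of (1) and (2)
local a = (`sdz'/`sdx') * (`rxz' - (`rxy')*(`ryz')) / (1 - (`rxy')^2)
local b = (`sdz'/`sdy') * (`ryz' - (`rxy')*(`rxz')) / (1 - (`rxy')^2)

* c^2 from (3), with Var(u) = 1 and Cov(x,y) = Corr(x,y)*sd(x)*sd(y)
local c2 = (`sdz')^2 - ((`a')*(`sdx'))^2 - ((`b')*(`sdy'))^2 ///
    - 2*(`a')*(`b')*(`rxy')*(`sdx')*(`sdy')
if `c2' < 0 {
    display as error "requested correlations are not attainable given Corr(x,y)"
    exit 459
}

* Generate z from (0)
generate double u = rnormal()
generate double z = (`a')*x + (`b')*y + sqrt(`c2')*u

* Check the result, then run the regression of y on x and z
correlate x y z
regress y x z
```

Since c^2 is strictly positive here, z is not an exact linear combination of x and y, so the regression of y on x and z does not suffer from perfect collinearity.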


