Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables

From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: RES: generating a variable with pre-specified correlations with other two (given) variables
Date   Wed, 31 Aug 2011 14:14:12 +0100

Richard's question is the more crucial one, but I guess that
Tirthankar meant -rnormal()-. Adding -runiform()- will add 0.5 on
average (although that could easily be fixed). Either way, adding
noise will reduce the correlations.
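
A minimal sketch of the point about noise, assuming simulated standard normal x and y (the real x and y are not shown in this thread) and an arbitrary seed and sample size. The names z0, z_u and z_n are made up for illustration:

```stata
clear
* arbitrary seed and sample size, only for reproducibility
set seed 2011
set obs 1000
* stand-ins for the real x and y, which are not shown in the thread
generate x = rnormal()
generate y = rnormal()
* noise-free linear combination from the earlier post
generate z0 = .15625*x + .40625*y
* same combination plus uniform noise on [0,1): mean 0.5, not zero
generate z_u = z0 + runiform()
* same combination plus standard normal noise: mean zero
generate z_n = z0 + rnormal()
summarize z0 z_u z_n
correlate x y z0 z_u z_n
```

-summarize- should show the mean of z_u sitting about 0.5 above that of z0, and -correlate- should show z_u and z_n less strongly correlated with x and y than z0 is.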


On Wed, Aug 31, 2011 at 3:01 PM, Richard Williams
<[email protected]> wrote:
> At 07:47 AM 8/31/2011, Tirthankar Chakravarty wrote:
>> Throw in some orthogonal, zero mean noise when constructing Z:
>> g z = .15625*x+.40625*y + runiform()
> I believe that will zap the correlations though, won't it? i.e. the
> correlations of z with x and y will get smaller.
>> > P.S. The reason I want to run the aforementioned regression is the
>> > following. Suppose I have an initial regression of y on x, and x turns
>> > out to be insignificantly different from zero at some chosen
>> > confidence level. Then I want to generate an example in which adding a
>> > new (artificial) variable z as a covariate I can get x to become
>> > significantly different from zero at the same confidence level. Based
>> > on the formula for the t-test, I think I can do this if I can control
>> > the correlations between the artificial variable and the original
>> > ones. The exercise is just for expositional purposes, I do not want
>> > to attach any deep meaning to it.
> If this is just for expositional purposes, it would probably be easier just
> to fake all the data with corr2data, rather than trying to create a combo of
> fake and real data. I think you could add a variable e that had 0
> correlation with x and y and nonzero correlation with z. I generally find it
> is easier to get fake data to behave the way I want rather than real data.
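
A minimal sketch of the -corr2data- route, assuming a made-up correlation matrix and n = 100; none of these numbers come from the thread. The matrix name C and the particular correlations are hypothetical, chosen so that x and y are only weakly correlated while z is correlated with both:

```stata
clear
* arbitrary seed, only for reproducibility
set seed 2011
* hypothetical correlation matrix, rows and columns in the order x y z
matrix C = (1, .1, .7 \ .1, 1, -.3 \ .7, -.3, 1)
* corr2data reproduces the specified correlations exactly in the sample
corr2data x y z, n(100) corr(C)
correlate x y z
* x alone: simple correlation with y is only .1
regress y x
* x with z added as a covariate
regress y x z
```

With these assumed values the t statistic on x should be about 1 in the first regression and about 5 in the second, because the partial correlation of y and x given z is (.1 - .7*(-.3))/sqrt((1 - .49)*(1 - .09)), roughly .46, against a simple correlation of .1.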