# Re: st: estimating data for a given correlation coefficient

 From Nicola Orsini <[email protected]>
 To [email protected]
 Subject Re: st: estimating data for a given correlation coefficient
 Date Thu, 20 Jan 2005 16:10:11 +0100

Hi Donald,

There should be something on this at http://www.ats.ucla.edu/stat/stata/

Something like this should do it (copy and paste it into a do-file):

program define lcs
version 7
args corr n
preserve

tempvar y x z w yhat

quietly {
set obs `n'
generate `y' = invnorm(uniform())*4.4 + 55
generate `z' = invnorm(uniform())*4.4 + 55
generate `x' = sqrt(1 - (`corr' )^2)*`z' + `corr' *`y'
egen `w'= std(`x')
replace `x' = `w'*4.4 + 55
regress `y' `x'
predict `yhat'
label var `y' "Y"
label var `x' "X"

}

local corr = string(`corr' , "%4.2f")

graph7 `y' `yhat' `x', s(oi) c(.l) /*
*/ ylabel(40 45 to 70) xlabel(40 45 to 70) /*
*/ t1("Pearson Correlation  r = `corr'  Observations = `n'") sort

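// if you want to keep the variables y and x, uncomment the three lines below
// (restore, not cancels the automatic restore that preserve triggers at program exit)
* gen y = `y'
* gen x = `x'
* restore, not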

end

* For example, to create y and x correlated at r = 0.80 with n = 200

lcs 0.80 200

* To see what happens when r ranges from -1 to 1

forvalues r = -1(.1)1 {
lcs `r' 200
sleep 100
// save the variables or the dataset here, or do whatever else you need
}
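
A short sketch of why the construction inside lcs should give the requested correlation. It assumes `y` and `z` are independent draws with the same standard deviation, as they are in the program. The symbols Y, Z, X and sigma below are notation introduced here for illustration, not taken from the code.

```latex
% Notation introduced for illustration only:
% Y, Z independent with Var(Y) = Var(Z) = \sigma^2 (\sigma = 4.4 in the program), r = requested correlation
\begin{align*}
X &= \sqrt{1 - r^2}\, Z + r\, Y \\
\operatorname{Var}(X) &= (1 - r^2)\,\sigma^2 + r^2 \sigma^2 = \sigma^2 \\
\operatorname{Cov}(X, Y) &= \sqrt{1 - r^2}\,\operatorname{Cov}(Z, Y) + r \operatorname{Var}(Y) = r\,\sigma^2 \\
\operatorname{Corr}(X, Y) &= \frac{r\,\sigma^2}{\sigma \cdot \sigma} = r
\end{align*}
```

The later standardize-and-rescale step (egen std, then *4.4 + 55) is a positive linear transformation, so it leaves the correlation unchanged and only puts x back on mean 55 and standard deviation 4.4. Note that the result holds in expectation: in any single draw of n = 200 the sample correlation between y and z is not exactly zero, so the sample correlation between y and x will scatter around r rather than equal it exactly.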

Hope it helps
Regards
Nicola

Dear all
I want to create some datasets of, say, 200 observations with defined correlations. How do I go about doing it? I could not find anything in the Stata FAQ or the manuals. Is it possible to do this within Stata?
Many thanks.

Department of Pediatrics & Public Health Sciences
2C3.92 WMC, University of Alberta
T6G 2R7

780-407-1244:O
780-407-7136:F

Nature has no reset button.

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*