
RE: st: question: command pwcorr


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   RE: st: question: command pwcorr
Date   Thu, 30 Oct 2003 17:47:38 -0000

May Boggess replied to Xiaoming Wang:

> > I want to retrieve all correlation coefficients from
> > command PWCORR.
> >
> > How to do it?
>
> The command -pwcorr- does not save much in the way of results. So to
> get hold of all the correlations, we need to calculate them ourselves.
>
> Below is some code that calculates the pairwise correlation
> coefficients and stores them in a matrix. It also saves, in another
> matrix, the number of observations used in the computation of each
> correlation.
>
> This is very important because this is the big difference between
> -corr- and -pwcorr-: -pwcorr- uses as many observations as possible
> in the computation of the correlation coefficient of each pair,
> whereas -corr- only uses those observations that are complete (no
> missing data) for all variables in the variable list.
>
> *------ generate some data-----------
>
> clear
> set obs 100
> local varnum=5
>
> forvalues i=1/`varnum'{
> gen var`i'=uniform()
> }
>
> replace var1=. if _n>80
> replace var4=. if _n<10
>
> *--------calculate correlations--------
>
> matrix C = I(`varnum')
> matrix N = I(`varnum')
>
> forvalues i=1(1)`varnum' {
>     forvalues j=1(1)`i' {
>         * each pair is computed once; both triangles are filled by symmetry
>         capture corr var`i' var`j'
>         if _rc==0 {
>             matrix C[`i',`j']=r(rho)
>             matrix C[`j',`i']=r(rho)
>             matrix N[`i',`j']=r(N)
>             matrix N[`j',`i']=r(N)
>         }
>         else {
>             matrix C[`i',`j']=.
>             matrix C[`j',`i']=.
>             matrix N[`i',`j']=.
>             matrix N[`j',`i']=.
>         }
>     }
>     * the last pass above had j equal to i, so r(N) is the count for var`i'
>     matrix C[`i',`i']=1
>     matrix N[`i',`i']=r(N)
> }
>
>
> *---display results-------------
> display as text "sample correlations"
> matrix list C, format(%6.4f) noheader
> display as text "number of observations"
> matrix list N,   noheader
>
> *--- check it's the same as from pwcorr-----
> pwcorr var1 var2 var3 var4 var5
>
>
> As you can see, for this code to work easily, the variables need to
> be named something nice. You can do this by renaming them, and if you
> want the dataset to return to its previous state once you're done,
> put -preserve- at the start and -restore- at the end.
>
> What's the easiest way to change a bunch of variable names? Try
> something like this:
>
>
> *--- get some data--------
> clear
> sysuse auto
> keep mpg price weight length
>
> *---- my list of variables------
> local varlist="price mpg weight length"
>
> *-----rename them----
> tokenize `varlist'
> local i=1
> while "`1'"!=""{
> rename `1' var`i'
> local i=`i'+1
> macro shift
> }
>
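
As a side note, the renaming step quoted above can also be sketched
with -foreach-, wrapped in the -preserve- and -restore- that May
mentions. This is only an illustration on the auto data; the loop
variable v and the counter i are my own names.

```stata
sysuse auto, clear
keep price mpg weight length

* keep a copy so the original names can be brought back afterwards
preserve

local i = 1
foreach v of varlist price mpg weight length {
    * price -> var1, mpg -> var2, and so on, in the order listed
    rename `v' var`i'
    local ++i
}

describe

* ... the correlation loop on var1-var4 would go here ...

restore
```

After -restore- the dataset has its original variable names again.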

May is explaining a solution from first principles,
always important to know when no canned solution
exists, and even when it does. In addition, her code is
efficient because it exploits the fact that
corr(x,y) = corr(y,x).
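
The difference between -corr- and -pwcorr- that May stresses is easy
to check for yourself. Here is a small sketch, assuming the auto data
shipped with Stata, in which rep78 has some missing values; the scalar
names n_all and n_pair are just for illustration.

```stata
sysuse auto, clear

* casewise deletion: one N for the whole variable list
quietly correlate price mpg rep78
scalar n_all = r(N)

* the same pair on its own uses every observation with both values present
quietly correlate price mpg
scalar n_pair = r(N)

display "N with rep78 in the list:  " n_all
display "N for price and mpg alone: " n_pair

* -pwcorr- with -obs- shows the per-pair N under each coefficient
pwcorr price mpg rep78, obs
```

The two counts differ whenever a variable elsewhere in the list is
missing for observations on which the pair itself is complete.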

To spell out how this approach compares with my earlier
answer to this question, which mentioned a quick-and-dirty
use of a canned solution: in May's notation, my solution
was equivalent to

makematrix C, from(r(rho)) listwise : correlate <varlist>

As she implies, the matrix of Ns is important too,
so that would need

makematrix N, from(r(N)) listwise : correlate <varlist>

Doing it twice is certainly inefficient in computer
time, but in total requires a bit less typing, and
makes no assumptions about the form of the varlist.
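
For what it is worth, the loop approach need not assume anything about
variable names either. Below is a rough sketch of May's loop rewritten
to pick names out of a local macro, so that no renaming is needed. The
auto data and the list in vars are only an example, and the macro names
(vars, k, vi, vj) are mine, not May's. The diagonal is filled from a
count of nonmissing values rather than from correlating a variable
with itself.

```stata
sysuse auto, clear
local vars price mpg rep78 weight
local k : word count `vars'

matrix C = I(`k')
matrix N = J(`k', `k', .)

forvalues i = 1/`k' {
    local vi : word `i' of `vars'

    * diagonal: correlation is 1, N is the nonmissing count
    quietly count if !missing(`vi')
    matrix N[`i',`i'] = r(N)

    * lower triangle only; corr(x,y) = corr(y,x) fills the rest
    local imax = `i' - 1
    forvalues j = 1/`imax' {
        local vj : word `j' of `vars'
        capture correlate `vi' `vj'
        if _rc == 0 {
            matrix C[`i',`j'] = r(rho)
            matrix C[`j',`i'] = r(rho)
            matrix N[`i',`j'] = r(N)
            matrix N[`j',`i'] = r(N)
        }
        else {
            matrix C[`i',`j'] = .
            matrix C[`j',`i'] = .
        }
    }
}

* label rows and columns with the variable names
foreach m in C N {
    matrix rownames `m' = `vars'
    matrix colnames `m' = `vars'
}

matrix list C, format(%6.4f)
matrix list N
pwcorr `vars', obs
```

The final -pwcorr- call is there only so that the two matrices can be
compared by eye with the canned output.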

Nick
[email protected]
