



Re: st: Strange behaviour of -correlate- command


From   Nick Sanders <sandersn@stanford.edu>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Strange behaviour of -correlate- command
Date   Thu, 9 Dec 2010 16:30:54 -0800

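If I recall correctly, Excel's COVAR doesn't compute the sample covariance. For some reason, it uses 1/n rather than 1/(n-1). Your manual calculation does the same thing, since the mean of prod also divides by n, whereas -correlate- divides by n-1. That likely explains your odd results.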
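
As a rough check of that explanation (a sketch, assuming the variable prod from the manual calculation quoted below is still in memory), rescaling the mean of prod by n/(n-1) should bring it close to the 1142.05 that -correlate- reports:

```stata
* prod = (y - mean(y)) * (r - mean(r)), as generated in the quoted message
quietly summarize prod

* mean(prod) divides the sum of cross-products by n;
* multiplying by n/(n-1) converts it to the n-1 divisor
display r(mean) * r(N) / (r(N) - 1)

* for comparison, the covariance that -correlate- itself stores
quietly correlate y r, covariance
display r(cov_12)
```

With n = 2419 the factor is 2419/2418, so the difference only shows up in the fourth significant digit.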

--
Nicholas J. Sanders, Ph.D.
Postdoctoral Fellow
Stanford Institute for Economic Policy Research
366 Galvez St, Room 228
Stanford, CA 94305

On Dec 9, 2010, at 4:23 PM, Zurab Sajaia wrote:

> Dear all,
> 
> I've run into a problem that I can't explain so far: I seem to be getting wrong covariance estimates. The results differ depending on whether I use the -correlate- command or do the calculation manually (I also exported the data to Excel and used the COVAR() function there, and Excel seems to be on my side),
> so I was wondering whether something is indeed wrong in Stata, or whether I'm doing it incorrectly (perhaps it's time to stop working and go home?)...
> 
> So here's the deal: I've uploaded an example dataset to the web (30kb):
> 
> . use http://www.adeptanalytics.org/download/temp/corr_bug.dta, clear
> 
> . corr y r, c
> (obs=2419)
>              |        y        r
> -------------+------------------
>            y |  2.8e+07
>            r |  1142.05  .083368
> 
> 
> 
> but if I do it manually:
> 
> . summarize y, meanonly
> . generate double y1 = y - r(mean)
> 
> . summarize r, meanonly
> . generate double r1 = r - r(mean)
> 
> . generate double prod = y1 * r1
> 
> . summarize prod
>     Variable |       Obs        Mean    Std. Dev.       Min        Max
> -------------+--------------------------------------------------------
>         prod |      2419    1141.579    2152.761  -53.76514   47015.59
> 
> 
> I get the same result (1141.579) using Excel's COVAR() function.
> Do you have any idea what could be happening here?
> 
> Thanks,
> Zurab
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



