RE: st: Strange behaviour of -correlate- command


From   Zurab Sajaia <zsajaia@hotmail.com>
To   statalist <statalist@hsphsun2.harvard.edu>
Subject   RE: st: Strange behaviour of -correlate- command
Date   Thu, 9 Dec 2010 19:50:53 -0500

You're absolutely right: my manual calculation used the mean of prod, i.e. it divided by n instead of (n-1). My bad, going home now :$.
 
Thanks a lot,
Zurab
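
For reference, the n versus (n-1) point can be checked with a short sketch. This is Python rather than Stata, purely for illustration: the covariance() helper and the toy data are hypothetical and are not the dataset from the thread. It assumes only what the thread states, namely that -correlate, covariance- divides by (n-1) while the mean of the products divides by n.

```python
# Illustrative only: toy data, not the dataset from the thread.
def covariance(xs, ys, ddof):
    """Covariance with divisor (n - ddof): ddof=1 is sample, ddof=0 is population."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - ddof)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 9.0]
n = len(xs)

sample_cov = covariance(xs, ys, ddof=1)      # divisor n - 1
population_cov = covariance(xs, ys, ddof=0)  # divisor n, i.e. the mean of the products

# The two estimates differ only by the factor (n - 1) / n.
assert abs(population_cov - sample_cov * (n - 1) / n) < 1e-12

# Same factor applied to the rounded figures quoted in the thread (obs=2419).
rescaled = 1142.05 * (2419 - 1) / 2419
print(round(rescaled, 2))
```

Rescaling the reported 1142.05 by 2418/2419 gives about 1141.58, which agrees with the manually computed mean of 1141.579 to the precision shown in the output.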


----------------------------------------
> Subject: Re: st: Strange behaviour of -correlate- command
> From: sandersn@stanford.edu
> Date: Thu, 9 Dec 2010 16:30:54 -0800
> To: statalist@hsphsun2.harvard.edu
>
> If I recall correctly, Excel doesn't calculate the COVAR quite right. For some reason, it uses 1/n rather than 1/(n-1). That likely explains your odd results.
>
> --
> Nicholas J. Sanders, Ph.D.
> Postdoctoral Fellow
> Stanford Institute for Economic Policy Research
> 366 Galvez St, Room 228
> Stanford, CA 94305
>
> On Dec 9, 2010, at 4:23 PM, Zurab Sajaia wrote:
>
> > Dear all,
> >
> > I've run into a problem I can't explain so far: I seem to be getting wrong covariance estimates. The result from the -correlate- command differs from the one I get by calculating manually (I also exported the data to Excel and used the COVAR() function there, and Excel seems to be on my side).
> > So I was wondering whether something is indeed wrong in Stata, or whether I'm doing it incorrectly (perhaps it's time to stop working and go home?)...
> >
> > So here's the deal: I've uploaded an example dataset to the web (30kb):
> >
> > . use http://www.adeptanalytics.org/download/temp/corr_bug.dta, clear
> >
> > . corr y r, c
> > (obs=2419)
> >
> >              |        y        r
> > -------------+------------------
> >            y |  2.8e+07
> >            r |  1142.05  .083368
> >
> >
> >
> > but if I do it manually:
> >
> > . summarize y, meanonly
> > . generate double y1 = y - r(mean)
> >
> > . summarize r, meanonly
> > . generate double r1 = r - r(mean)
> >
> > . generate double prod = y1 * r1
> >
> > . summarize prod
> >
> >     Variable |       Obs        Mean    Std. Dev.       Min        Max
> > -------------+--------------------------------------------------------
> >         prod |      2419    1141.579    2152.761  -53.76514   47015.59
> >
> >
> > The same result (1141.579) I get using Excel's COVAR() function.
> > Do you have any ideas what can be happening here?
> >
> > Thanks,
> > Zurab
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/
>
>

