
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: strange results with corr
Date: Thu, 17 Feb 2005 16:32:01 -0000

It is simpler than you seem to fear. Think in terms of a scatter plot.
With two dummy variables, the possible data points in general are the
4 corners of the unit square. The correlation treating these numerically
will have modulus 1 if and only if the points populated in practice
are just the two opposite corners. That is, with

            * 1,1

    * 0,0

the correlation would be 1, and with

    * 0,1

            * 1,0

the correlation would be -1. In either case a straight line would be
a perfect fit to the data, irrespective of how many data points
fall on each corner, so long as some do.

In practice, in your dataset the data fall on 3 out of 4 corners, and we
can't say anything so simple: the result of the correlation
will depend on the votes cast, as it were. With this election
result (wifelit on the horizontal axis, wifeprim on the vertical)

    2458

    11119   739

the best-fit line would clearly tilt downwards, but fairly gently,
so the correlation looks fine by me, qua correlation.

Nick
n.j.cox@durham.ac.uk

Kenley Barrett

> I'm sorry, I should have included all possible counts. I have pasted
> them below. To be sure that I understand properly: this correlation
> coefficient is due to the fact that although a value of 1 for wifelit
> guarantees a value of 0 for wifeprim, and a value of 1 for wifeprim
> guarantees a value of 0 for wifelit, a value of 0 for wifeprim does
> NOT guarantee a value of 1 for wifelit, and a value of 0 for wifelit
> does NOT guarantee a value of 1 for wifeprim. So the correlation
> coefficient should not be -1 (as I was thinking earlier). Could you
> please confirm for me that I'm understanding this right? I'm sorry to
> bother you again; I am new at this, as you can tell.
>
> . count if wifelit == 1 & wifeprim == 1
>     0
>
> . count if wifelit == 0 & wifeprim == 1
>  2458
>
> . count if wifelit == 0 & wifeprim == 0
> 11119
>
> . count if wifelit == 1
>   739
>
> . count if wifeprim == 1
>  2458
>
> . count if wifelit == 1 & wifeprim == 0
>   739
>
> . corr wifelit wifeprim
> (obs=14316)
>
>              |  wifelit wifeprim
> -------------+------------------
>      wifelit |   1.0000
>     wifeprim |  -0.1062   1.0000
>
> On Thu, 17 Feb 2005 15:58:32 -0000, Nick Cox
> <n.j.cox@durham.ac.uk> wrote:
> > You evidently have two dummies here, both 0 or 1.
> >
> > You give two of the four possible
> > counts, from which we can infer that
> > in 14316 - 2458 cases the values are 1 0 or 0 0.
> >
> > That seems entirely consistent with the correlation
> > you get. The entire 2 by 2 table from -tab wifeprim
> > wifelit- is the context for the correlation.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Kenley Barrett
> >
> > > I am getting strange results when I run the "corr" command on my
> > > variables. From my understanding, "corr" gives the correlation
> > > coefficient, so if a value of 1 for Dummy Variable A guarantees a
> > > value of 0 for Dummy Variable B, then corr should give a result of -1.
> > > But instead I am getting values between 0 and -1. A sample of two
> > > variables shown below:
> > >
> > > . count if wifelit == 1 & wifeprim == 1
> > >     0
> > >
> > > . count if wifelit == 0 & wifeprim == 1
> > >  2458
> > >
> > > . corr wifelit wifeprim
> > > (obs=14316)
> > >
> > >              |  wifelit wifeprim
> > > -------------+------------------
> > >      wifelit |   1.0000
> > >     wifeprim |  -0.1062   1.0000
> > >
> > > What could be the problem? Am I misunderstanding the corr command?
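
As a cross-check on the figure discussed in this thread, the correlation can be recomputed from the cell counts alone. What follows is a minimal Python sketch, not something taken from the original posts; the function name `weighted_corr` and the dictionary layout are illustrative assumptions. It treats each cell as a point on the unit square weighted by its count and applies the usual Pearson formula.

```python
import math

# Cell counts reported in the thread, keyed by (wifelit, wifeprim).
counts = {(1, 1): 0, (0, 1): 2458, (0, 0): 11119, (1, 0): 739}


def weighted_corr(cells):
    """Pearson correlation of x and y given a count for each (x, y) cell.

    Hypothetical helper for illustration; assumes at least two distinct
    x values and two distinct y values have nonzero counts.
    """
    n = sum(cells.values())
    mean_x = sum(x * c for (x, _), c in cells.items()) / n
    mean_y = sum(y * c for (_, y), c in cells.items()) / n
    # Count-weighted sums of cross-products and squared deviations.
    sxy = sum(c * (x - mean_x) * (y - mean_y) for (x, y), c in cells.items())
    sxx = sum(c * (x - mean_x) ** 2 for (x, _), c in cells.items())
    syy = sum(c * (y - mean_y) ** 2 for (_, y), c in cells.items())
    return sxy / math.sqrt(sxx * syy)


r = weighted_corr(counts)
print(f"n = {sum(counts.values())}, r = {r:.4f}")  # prints: n = 14316, r = -0.1062
```

With only two opposite corners populated, for example `weighted_corr({(0, 1): 5, (1, 0): 100})`, the same function returns -1 up to floating-point rounding whatever the counts are, which matches the point about opposite corners made in the reply.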
