# RE: st: RE: strange results with corr

 From: "Nick Cox" | Subject: RE: st: RE: strange results with corr | Date: Thu, 17 Feb 2005 16:32:01 -0000

```It is simpler than you seem to fear.
Think in terms of a scatter plot. With two dummy
variables, the possible data points in
general are the 4 corners of the unit
square. The correlation treating these
numerically will have modulus 1 if and
only if the points populated in practice
are just the two opposite corners.

That is, with

* 1,1

* 0,0

the correlation would be 1, and with

* 0,1

* 1,0

the correlation would be -1. In either
case a straight line would be a perfect
fit to the data, irrespective of how
many data points fall on each corner,
so long as some do.

In practice, the data in your dataset fall on
3 out of 4 corners, and we can't say anything
so simple: the result of the correlation
will depend on the votes cast, as it were.
With this election result

wifeprim
    1 |   2458      0
    0 |  11119    739
      +---------------
            0      1   wifelit

the best-fit line would clearly tilt downwards,
but fairly gently, so the correlation looks fine by me,
qua correlation.

Nick
n.j.cox@durham.ac.uk

Kenley Barrett

> I'm sorry, I should have included all possible counts. I have pasted
> them below. To be sure that I understand properly: this correlation
> coefficient is due to the fact that although a value of 1 for wifelit
> guarantees a value of 0 for wifeprim, and a value of 1 for wifeprim
> guarantees a value of 0 for wifelit, a value of 0 for wifeprim does
> NOT guarantee a value of 1 for wifelit, and a value of 0 for wifelit
> does NOT guarantee a value of 1 for wifeprim. So the correlation
> coefficient should not be -1 (as I was thinking earlier). Could you
> please confirm for me that I'm understanding this right? I'm sorry to
> bother you again; I am new at this, as you can tell.
>
> . count if wifelit == 1 & wifeprim == 1
>     0
>
> . count if wifelit == 0 & wifeprim == 1
>  2458
>
> . count if wifelit == 0 & wifeprim == 0
> 11119
>
> . count if wifelit == 1
>   739
>
> . count if wifeprim == 1
>  2458
>
>
> . count if wifelit == 1 & wifeprim == 0
>   739
>
> . corr wifelit wifeprim
> (obs=14316)
>
>              |  wifelit wifeprim
> -------------+------------------
>      wifelit |   1.0000
>     wifeprim |  -0.1062   1.0000

> On Thu, 17 Feb 2005 15:58:32 -0000, Nick Cox
> <n.j.cox@durham.ac.uk> wrote:
> > You evidently have two dummies here, both 0 or 1.
> >
> > You give two of the four possible
> > counts, from which we can infer that
> > in 14316 - 2458 cases the values are 1 0 or 0 0.
> >
> > That seems entirely consistent with the correlation
> > you get. The entire 2 by 2 table from -tab wifeprim
> > wifelit- is the context for the correlation.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Kenley Barrett
> >
> > > I am getting strange results when I run the "corr" command on my
> > > variables. From my understanding, "corr" gives the correlation
> > > coefficient, so if a value of 1 for Dummy Variable A guarantees a
> > > value of 0 for Dummy Variable B, then corr should give a result of -1.
> > > But instead I am getting values between 0 and -1. A sample of two
> > > variables shown below:
> > >
> > > . count if wifelit == 1 & wifeprim == 1
> > >     0
> > >
> > > . count if wifelit == 0 & wifeprim == 1
> > >  2458
> > >
> > > . corr wifelit wifeprim
> > > (obs=14316)
> > >
> > >              |  wifelit wifeprim
> > > -------------+------------------
> > >      wifelit |   1.0000
> > >     wifeprim |  -0.1062   1.0000
> > >
> > > What could be the problem? Am I misunderstanding the corr command?

```
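
As a check on the reasoning in the thread, the correlation can be rebuilt from the cell counts alone. The sketch below is not from the original posts: it assumes a fresh Stata session, and the variable `n` holding the cell frequencies is a name introduced here for illustration. With the counts reported above, `corr` with frequency weights should reproduce the -0.1062 shown in the thread.

```stata
* Minimal sketch: rebuild the 2 x 2 table from the reported counts.
* `n' is a hypothetical frequency variable, not part of the original dataset.
clear
input wifelit wifeprim n
0 1  2458
0 0 11119
1 0   739
end

* Show the table that gives the correlation its context.
tabulate wifeprim wifelit [fweight = n]

* Pearson correlation of the two dummies, weighting each corner by its count.
corr wifelit wifeprim [fweight = n]
```

When the 1,1 corner is empty, the correlation reduces to -sqrt(p1*p2 / ((1 - p1)*(1 - p2))), where p1 and p2 are the proportions of ones in each dummy (here 739/14316 and 2458/14316). It reaches -1 only when p1 + p2 = 1, that is, when the 0,0 corner is empty as well.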