# st: RE: Re: Econometrics Theory Questions on Dummies and Correlation Analysis

From: "Nick Cox"
Subject: st: RE: Re: Econometrics Theory Questions on Dummies and Correlation Analysis
Date: Tue, 19 Apr 2005 10:33:48 +0100

```
Paul seems to be implying that whether a binary variable
is nominal is somehow deeper or more fundamental than it
being binary. I don't accept that at all.

To repeat an earlier example:

Suppose you have two identical
dummy variables (and some variation in each).
In terms of a scatter plot, you have two clusters,
one at the origin (0,0) and one at (1,1), like this

            *

*

and a straight line is a perfect summary of such
data, and so the Pearson correlation is identically 1.
The graph above is label-free, and deliberately so,
as the result holds irrespective of coding. I could
code the two levels as 7 and 42 or any other distinct
numbers and the correlation is unchanged.  And
I don't see any objection to calling that a linear
relationship.

Nick
n.j.cox@durham.ac.uk

Paul Millar wrote:

> What fun this all is!   Who'd have thought!  Thanks for the
> fun with fundamentals!
>
> I think what Sam was getting at is that with binary
> variables, once you have the mean, you can throw away the
> data since the variance is directly derived from the mean.
> Nothing further is required, even to calculate confidence intervals.
>
> And I think Nick's response indicates why the level of
> measurement is relevant.  If the LOM is nominal, there is no
> linear relationship, strictly speaking.  Only when the scales
> are equi-interval does a linear relationship, and thus the
> correlation, make theoretical sense; the correlation being a
> summary of the linear relationship, as Nick points out.

```
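
Both numerical claims in this exchange are easy to check. What follows is a minimal sketch in Python rather than Stata, using made-up data; the helper names `pearson` and `recode` and the particular codes (7 and 42, -3 and 10) are assumptions for illustration only. It checks Nick's point that two identical dummies have Pearson correlation 1 however the two levels are coded, and the point Paul attributes to Sam that the variance of a 0/1 variable follows from its mean.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def recode(values, low, high):
    """Map a 0/1 variable onto two arbitrary distinct codes (0 -> low, 1 -> high)."""
    return [high if v == 1 else low for v in values]


# Hypothetical data: two identical dummies with some variation in each.
d1 = [0, 0, 0, 1, 1, 0, 1, 1, 0, 1]
d2 = list(d1)

# Correlation under the usual 0/1 coding.
r_01 = pearson(d1, d2)

# Correlation after recoding each variable with different arbitrary numbers.
# Both recodings keep "1" as the larger code, so the sign is preserved.
r_recoded = pearson(recode(d1, 7, 42), recode(d2, -3, 10))

# For a 0/1 variable with mean p, the population variance (divisor n) is p * (1 - p).
# The sample variance (divisor n - 1) differs only by the factor n / (n - 1).
p = sum(d1) / len(d1)
var_direct = sum((v - p) ** 2 for v in d1) / len(d1)
var_from_mean = p * (1 - p)

print(r_01, r_recoded)
print(var_direct, var_from_mean)
```

One caveat on the recoding: if the larger code were assigned to the opposite level on only one of the two variables, the correlation would change sign, though its magnitude would still be 1.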