Re: st: Can Spearman's rho be used to measure of the degree of association between two binary variables ?


From   Dirk Enzmann <dirk.enzmann@uni-hamburg.de>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Can Spearman's rho be used to measure of the degree of association between two binary variables ?
Date   Mon, 21 May 2012 15:47:44 +0200

Marcos Vinicius asked whether Spearman's rho can be used to measure the degree of association between two binary variables; see:

http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-836.html

to which he received very helpful answers (perhaps more than he initially wanted to know):

- from Maarten Buis and Richard Williams
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-855.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-859.html
who discuss whether multicollinearity should be regarded as a problem at all

- and from Cameron McIntosh, Richard Stoll, and Roger Newson
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-837.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-838.html
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1205/date/article-858.html
who point to alternative measures of the association between two binary variables.

Let me add two comments on the use of tetrachoric (or polychoric) correlation coefficients vs. Pearson (or Spearman) correlation coefficients in the case of two binary variables:

1) Tetrachoric and Pearson correlations answer different questions: The tetrachoric coefficient estimates the correlation of two *latent* (quasi-continuous) variables "behind" the observed dichotomous variables, thus assuming that both variables are artificially dichotomous, whereas the Pearson coefficient shows how the *observed* values correlate, thus assuming naturally dichotomous variables.

2) Ledesma et al. (2011) (see: http://openjournal.konradlorenz.edu.co/index.php/rlpsi/article/viewFile/459/463 ), cited by Cameron, write: "Stata gives their users a function based on a work by Edwards and Edwards (1984), that is basically 'a very rough approximation' and is consequently unsuitable for many applications ..." (p. 182). This is not quite correct: Stata computes the tetrachoric correlation "... by using the Edwards and Edwards (1984) noniterative estimator as the *initial* value" (see: [R] Base Reference, p. 2196; emphasis mine), i.e., only as the starting point of the estimation. The coefficient calculated by Stata is at least as precise as the coefficient calculated by Ledesma et al.'s Vista-Tetrachor program:

* --- Example from Ledesma et al., p. 183 ------------------------
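clear
* one row per cell of the 2x2 table; ncases = cell frequency
input y x ncases
1 1 203
1 0 186
0 1 167
0 0 374
end
* expand to one observation per case
expand ncases
tetrachoric x y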
* --- End of Stata example ---------------------------------------
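
To illustrate comment 1) with the same table, here is a minimal sketch that contrasts the correlation of the *observed* values with the estimated *latent* correlation. It uses only the built-in commands -correlate-, -spearman-, and -tetrachoric-; the data are the cell frequencies from the example above. For two binary variables, Pearson's r and Spearman's rho coincide (both equal the phi coefficient), because the midranks of a binary variable are a linear function of its values. The tetrachoric estimate will usually differ from them, since it refers to the assumed latent bivariate normal variables:

```stata
* Sketch: observed vs. latent correlation for the same 2x2 table
clear
input y x ncases
1 1 203
1 0 186
0 1 167
0 0 374
end
expand ncases

* correlation of the observed 0/1 values (phi coefficient)
correlate x y

* identical to Pearson's r here, because both variables are binary
spearman x y

* estimated correlation of the assumed latent normal variables
tetrachoric x y
```

Which coefficient is appropriate depends on whether the variables are regarded as naturally or artificially dichotomous, not on which value is larger.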

Dirk

========================================
Dr. Dirk Enzmann
Institute of Criminal Sciences
Dept. of Criminology
Rothenbaumchaussee 33
D-20148 Hamburg
Germany

phone: +49-(0)40-42838.7498 (office)
       +49-(0)40-42838.4591 (Mrs Billon)
fax:   +49-(0)40-42838.2344
email: dirk.enzmann@uni-hamburg.de
http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html
========================================