# Re: st: clogit data format

From: Margaret R Grove
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: clogit data format
Date: Fri, 21 Mar 2008 12:55:02 -0400

Thanks for the references, Nick. I'm afraid I'm mostly a programmer and do fairly basic analyses. I can manipulate data with Stata fairly well into whatever shape is needed and have done lots of automated table generation. I'm now into some stuff I'm not so comfortable with, and which I've been asked to do with the choices already made for me. That makes it difficult to respond to your questions. However, it's most helpful that you're asking them, since they give me more to chew on.

Yes, the original ratings were not dichotomized. However, the number of abnormal ratings is small in comparison to the "normals", so in the 4 cases I'm looking at, dichotomizing probably does make sense. We looked at the data in many ways before distilling it down to kappas between reader pairs. The conditional logistic regression is, as I understand it, an attempt to obtain a p-value describing the distribution of responses between normal and abnormal, to satisfy reviewers' requests.

m

Nick Cox wrote:

Whoa! What is this "before we dichotomised it"? If you mean that your 0s
and 1s are not the original ratings, aren't you just throwing away information?

Anyway, I have never understood all the enthusiasm for kappa, which
despite its clear definition is just a single summary measure. I have to suspect that the
lure of a single, supposedly simple summary, which comes wrapped up
nicely with a P-value attached, sometimes triumphs over the challenge of
looking at the fine structure of agreement (or more precisely
disagreement).
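
As a minimal illustration of that point, here is a Python sketch (not Stata, and not from the original thread) that computes Cohen's kappa for two readers alongside the 2x2 table it summarizes. The counts are hypothetical, chosen only to mimic a situation where abnormal ratings are rare; the function name `cohen_kappa` is likewise just an illustrative choice.

```python
def cohen_kappa(r1, r2):
    """Cohen's kappa for two readers' 0/1 ratings of the same cases.

    Returns the 2x2 table, the observed agreement, and kappa.
    """
    n = len(r1)
    if n == 0 or n != len(r2):
        raise ValueError("need two equally long, non-empty rating lists")

    # table[a][b] = number of cases rated a by reader 1 and b by reader 2
    table = [[0, 0], [0, 0]]
    for a, b in zip(r1, r2):
        table[a][b] += 1

    observed = (table[0][0] + table[1][1]) / n

    # Marginal proportions of "abnormal" (1) for each reader
    p1 = (table[1][0] + table[1][1]) / n
    p2 = (table[0][1] + table[1][1]) / n

    # Agreement expected by chance if the readers rated independently
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return table, observed, kappa


# Hypothetical data: 100 cases, 0 = normal, 1 = abnormal.
# 90 both normal, 4 abnormal by reader 1 only,
# 4 abnormal by reader 2 only, 2 abnormal by both.
reader1 = [0] * 90 + [1] * 4 + [0] * 4 + [1] * 2
reader2 = [0] * 90 + [0] * 4 + [1] * 4 + [1] * 2

table, observed, kappa = cohen_kappa(reader1, reader2)
print("2x2 table:", table)
print("observed agreement:", round(observed, 3))
print("kappa:", round(kappa, 3))
```

With these made-up counts the readers agree on 92% of cases, yet kappa is only about 0.29, because nearly all of the agreement is on the common "normal" category. The table, not the single number, shows where the disagreements actually lie.
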

I don't know what your precise problem is, but I have tackled what may be
similar ones. For example, in a couple of papers I have looked at
problems in the Earth sciences in which several methods were used to
measure what should be the same thing. Scientifically, putting a single
number on the strength of overall agreement has never seemed a terribly
useful thing to do. If overall agreement is extremely high, it is clear
that the methods do all agree very well, but what's more typical in my
experience is that the agreement is moderate or worse. In that situation
the real challenge is to try to identify (e.g.) whether one or two
methods are really out of line with the others. Naturally it need not be
a voting matter if one method is in some sense known to be very good
(even a 'gold standard') and the others are poor. Admittedly if you are
dealing with the ratings given by various medics then implying that one
or more may not be so competent could be a difficult matter.
There is more in the same spirit (statistically, not politically) at

SJ-4-3 gr0005: Speaking Stata: Graphing agreement and disagreement
N. J. Cox
Q3/04, SJ 4(3): 329--349 (no commands)
How to select the right graph to portray comparison or assessment of agreement or disagreement between data measured on identical scales

which is now in the public domain via
http://www.stata-journal.com/sjpdf.html?articlenum=gr0005

and in
Cox, N.J. 2006. Assessing agreement of measurements and predictions in
geomorphology. Geomorphology 76: 332-346. doi:10.1016/j.geomorph.2005.12.001
which may or may not be accessible to you.
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/