Re: st: reliability with -icc- and -estat icc-

From	[email protected]
To	[email protected]
Subject	Re: st: reliability with -icc- and -estat icc-
Date	Wed, 27 Feb 2013 15:06:37 -0000

Lenny Lesser wrote:

I have 4 raters that gave a score of 0-100 on 11 smartphone
applications.
The data is skewed right, as they all got low scores.  I'm using the
ranks (within an individual) instead of the actual scores.  I want to
know the correlation in ranking between the different raters.

I've tried the two commands:

-xtmixed rank Application || Rater: , reml
-estat icc

(icc=0.19)

and

-icc rank Rater Application, mixed consistency

(icc=0.34)

They give me two different answers. Which one is correct?

Next, we found out that rater 4 was off the charts, and we want to
eliminate her and rerun the analysis. When we do this we get wacky
ICCs.  In the first method we get an ICC of 2e-26.  In the 2nd method
(-icc), we get -.06.  Eliminating any of the other raters gives us
ICCs close to the original ICC.  Why are we getting such a crazy
number when we eliminate this 4th rater?

I'm guessing this might be instability in the model, but I'm not sure
how to get around it.

----------------------------------------------------------------------

When in doubt, try going back to a reference source (
www.hongik.edu/~ym480/Shrout-Fleiss-ICC.pdf ) and manually computing
the ICC.  According to the source, "ICC is the correlation between
one measurement . . . on a target and another measurement obtained on
that target."  In your case, the targets are the smartphone
applications.
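
For reference, here is a sketch of the two-way random-effects,
single-rater formula, ICC(2, 1), written in the mean-square notation
that the icc21 program below uses: BMS, JMS and EMS are the
between-targets, between-judges and residual mean squares, k is the
number of judges (4 here) and n is the number of targets (11 here).
The second line is the corresponding variance-components form, which
assumes that any judge-by-target interaction is absorbed into the
residual, as it must be with one score per rater per application.

```latex
\mathrm{ICC}(2,1) =
  \frac{\mathrm{BMS} - \mathrm{EMS}}
       {\mathrm{BMS} + (k - 1)\,\mathrm{EMS} + k\,(\mathrm{JMS} - \mathrm{EMS})/n}

\mathrm{ICC}(2,1) =
  \frac{\sigma^2_{\mathrm{target}}}
       {\sigma^2_{\mathrm{target}} + \sigma^2_{\mathrm{judge}} + \sigma^2_{\mathrm{residual}}}
```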

By the way, Rater #4 is providing valuable information about rater
reliability, and so I recommend against eliminating her scores from
the ICC computation.  

Just by inspection, raters are not reliable--if your sample is
representative, then a quarter of the population of raters disagrees
dramatically with the rest; even excluding this fraction, the ICC is
less than 60%.  Moreover, none of the raters' scores covers anywhere
near the dynamic range you and your colleagues have allocated for the
measurement.

My take on all that would be that your volunteers need better
training on evaluating smartphone software in the manner that you
want it done.  Perhaps you and your colleagues could provide more
explicit instructions on what you're looking for in measuring the
characteristic(s) of the software that you're trying to measure.

Joseph Coveney

version 11.2

clear *
set more off

input byte(Application Rater Score rank)
5 1 2 1
7 1 5 2
2 1 6 3
9 1 6 3
11 1 7 4
6 1 7 4
8 1 11 5
3 1 13 6
4 1 16 7
10 1 17 8
1 1 18 9
6 2 1 1
5 2 2 2
11 2 3 3
7 2 3 3
4 2 5 4
1 2 7 5
8 2 8 6
2 2 9 7
3 2 10 8
10 2 12 9
9 2 12 9
5 3 2 1
2 3 5 2
7 3 6 3
6 3 6 3
9 3 6 3
11 3 7 4
8 3 11 5
3 3 13 6
4 3 15 7
10 3 16 8
1 3 17 9
7 4 0 1
1 4 1 2
9 4 1 2
6 4 1 2
8 4 1 2
4 4 1 2
5 4 1 2
3 4 1 2
11 4 1 2
2 4 2 3
10 4 3 4
end

program define icc21
    version 11.2
    syntax varlist [if]

    * Two-way ANOVA; varlist order is: score, judge (rater), target
    quietly anova `varlist' `if'
    tempname BMS JMS EMS k n ICC
    * Mean squares: between targets, between judges, residual
    scalar define `BMS' = e(ss_2) / e(df_2)
    scalar define `JMS' = e(ss_1) / e(df_1)
    scalar define `EMS' = e(rss) / e(df_r)
    * k = number of judges, n = number of targets
    scalar define `k' = e(df_1) + 1
    scalar define `n' = e(df_2) + 1
    * Shrout & Fleiss ICC(2, 1) from the mean squares
    scalar define `ICC' = (`BMS' - `EMS') / ///
        (`BMS' + (`k' - 1) * `EMS' + (`k' * (`JMS' - `EMS') / `n'))
    display in smcl as text "ICC Type 2, single rater"
    display in smcl as text "ICC(2, 1) = " `ICC'
end

program define iccem
    version 11.2
    syntax

    * Uses the most recent -xtmixed- fit, which stores log standard
    * deviations; the first _all block is the target (Application),
    * the second is the judge (Rater)
    tempname sigma2_judge sigma2_target sigma2_residual ICC
    scalar define `sigma2_target' = exp(_b[lns1_1_1:_cons])^2
    scalar define `sigma2_judge' = exp(_b[lns1_2_1:_cons])^2
    scalar define `sigma2_residual' = exp(_b[lnsig_e:_cons])^2
    * ICC(2, 1) = target variance over total variance
    scalar define `ICC' = `sigma2_target' / ///
        (`sigma2_target' + `sigma2_judge' + `sigma2_residual')
    display in smcl as text "ICC Type 2, single rater"
    display in smcl as text "ICC(2, 1) = " `ICC'
end

*
* Within-rater rank-transformed scores
*
xtmixed rank || _all:R.Application || _all:R.Rater, ///
    reml nolrtest nostderr variance nolog
iccem

icc21 rank Rater Application

xtmixed rank if Rater != 4 || _all:R.Application || _all:R.Rater, ///
    reml nolrtest nostderr variance nolog
iccem

icc21 rank Rater Application if Rater != 4 

*
* Original scores
*
xtmixed Score || _all:R.Application || _all:R.Rater, ///
    reml nolrtest nostderr variance nolog
iccem

icc21 Score Rater Application

xtmixed Score if Rater != 4 || _all:R.Application || _all:R.Rater, ///
    reml nolrtest nostderr variance nolog
iccem

icc21 Score Rater Application if Rater != 4 

exit


