Statalist



RE: st: RE: st: C-statistic with -gologit2-


From   "Newson, Roger B" <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: RE: st: C-statistic with -gologit2-
Date   Wed, 7 Oct 2009 21:22:20 +0100

Yes, you would have to calculate separate c-statistics with -mlogit-, as you describe. And these would have to be restricted to the 2 groups being compared, in order to make sense.
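
A minimal sketch of those separate c-statistics, assuming a hypothetical outcome variable y coded 1 to 4 with 1 as the baseline, and hypothetical covariates x1 and x2 (none of these names come from the thread). It uses -predict- with the -xb- and -outcome()- options after -mlogit-, and restricts each -somersd- call to the two groups being compared:

```stata
* Hypothetical data: outcome y coded 1-4 (1 = baseline), covariates x1 x2.
mlogit y x1 x2, baseoutcome(1)

forvalues k = 2/4 {
    * Linear predictor for outcome `k' versus the baseline outcome
    predict double xb`k', xb outcome(`k')

    * Indicator: 1 if outcome `k', 0 if baseline, missing for all other groups
    generate byte is`k' = (y == `k') if inlist(y, 1, `k')

    * Harrell's c of the linear predictor with respect to the indicator,
    * using only observations in group 1 or group `k'
    somersd is`k' xb`k' if inlist(y, 1, `k'), transf(c) tdist
}
```

The confidence intervals from this sketch are in-sample, so the earlier caveat about out-of-sample prediction still applies.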

In the case of -gologit2-, you could also calculate multiple c-statistics, one for each partition of the outcome values. Alternatively, you could presumably define one stratum for each partition of the outcome variable and expand each subject into a cluster of observations (1 observation per subject per partition). For each subject-partition, define X as a binary indicator of that subject's membership of the higher group in that partition, and define Y as the linear predictor for that subject of membership of the upper group in that partition. The summary c-statistic is then the Harrell's c of Y with respect to X, stratified by partition. As in:

somersd X Y, cluster(subject) wstrata(partition) transf(c) tdist

where -subject- is the subject ID for each subject-partition, -partition- is the partition variable for each subject-partition, and X and Y are as defined above. The c-statistic would then summarize the general ability of the linear predictors (as stored in Y) to predict the membership of upper groups of partitions (as stored in X), restricted to comparisons involving the same linear predictor for the same partition.
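
One way the expanded dataset might be built is sketched below. It assumes a hypothetical ordinal outcome y coded 1 to 4, hypothetical covariates x1 and x2, and one observation per subject to begin with. It also assumes that -predict- after -gologit2- accepts -xb- together with -outcome(#j)- to select the equation for the j-th partition; check -help gologit2- before relying on that.

```stata
* Hypothetical data: ordinal y coded 1-4, covariates x1 x2, one row per subject.
gologit2 y x1 x2

* One linear predictor per partition (1 vs 2-4, 1-2 vs 3-4, 1-3 vs 4).
* Assumption: -predict- after -gologit2- supports xb with outcome(#j).
forvalues j = 1/3 {
    predict double xb`j', xb outcome(#`j')
}

* Expand to one observation per subject per partition
generate long subject = _n
reshape long xb, i(subject) j(partition)

* X = 1 if the subject is in the upper group of this partition, 0 otherwise
generate byte X = (y > partition) if !missing(y)

* Y = linear predictor for membership of the upper group of this partition
rename xb Y

somersd X Y, cluster(subject) wstrata(partition) transf(c) tdist
```

With 4 outcome groups there are 3 partitions, so each subject contributes a cluster of 3 observations.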

I hope this helps.

Best wishes

Roger


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected] 
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Richard Williams
Sent: 07 October 2009 20:59
To: [email protected]; '[email protected]'
Subject: Re: st: RE: st: C-statistic with -gologit2-

At 01:30 PM 10/7/2009, Newson, Roger B wrote:
>In the case of ordinal regression, instead of using the predicted 
>probability, you should use the linear predictor, computed using 
>-predict- with the -xb- option. This linear predictor is an ordinal 
>predictor of the outcome. It then makes sense to use the 
>c-statistic, although the confidence intervals should only be taken 
>seriously if calculated (using out-of-sample prediction) in a 
>different dataset from the dataset in which the ordinal model was fitted.

Thanks Roger.  This won't work with gologit2, because there are 
multiple equations and hence multiple XBs.  gologit2 is like mlogit 
in that respect.

>In the case of mlogit, there are multiple linear predictors, 
>interpreted as the log odds ratios (per X-unit) of the various 
>non-baseline outcomes compared to the baseline outcome. In that 
>case, the c-statistic for the linear predictor for each non-baseline 
>outcome only makes sense if restricted to observations with either 
>that non-baseline outcome or the baseline outcome.

So, does that mean you would compute separate C statistics only using 
groups 1 and 2, then 1 and 3, then 1 and 4 (assuming group 1 is the 
baseline and there are 4 groups)?

gologit2 doesn't quite fit into this scheme either.  gologit2 is like 
a series of binary logistic regressions with different 
dichotomizations of the original ordinal variable.  First, it is 
group 1 versus groups 2, 3, 4; then groups 1 and 2 versus groups 3 
and 4; then groups 1, 2 and 3 versus 4.  If proportional odds holds, 
each dichotomization produces the same coefficients except for the 
intercepts.  I am not sure how the C statistic fits in with such a 
scheme; perhaps, in the above you would have 3 different C statistics?


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  [email protected]
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



