Statalist



Re: st: RE: st: C-statistic with -gologit2-


From   Sara Muller <s.muller@cphc.keele.ac.uk>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: st: C-statistic with -gologit2-
Date   Thu, 08 Oct 2009 08:41:39 +0100

Thanks to Roger and Richard for their helpful contributions here. I will try partitioning the data and see what I get.
Best wishes,
Sara

Newson, Roger B wrote:
Yes, you would have to calculate separate c-statistics with -mlogit-, as you describe. And these would have to be restricted to the 2 groups being compared, in order to make sense.
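A minimal sketch of these pairwise c-statistics, assuming a hypothetical outcome -y- coded 1 to 4 with group 1 as the baseline, hypothetical predictors -age- and -bmi-, and the user-written -somersd- package already installed:

```stata
* Hypothetical names: y (outcome coded 1-4), age and bmi (predictors)
mlogit y age bmi, baseoutcome(1)

forvalues k = 2/4 {
    * Linear predictor for outcome `k' versus the baseline outcome
    predict double xb`k', xb outcome(`k')

    * Harrell's c, restricted to the baseline group and group `k'
    somersd y xb`k' if inlist(y, 1, `k'), transf(c) tdist
}
```

Each pass through the loop compares only two groups, so each c-statistic describes one non-baseline outcome against the baseline.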

In the case of -gologit2-, you could also calculate multiple c-statistics, one for each partition of the outcome values. Alternatively, you could presumably compute a single summary c-statistic as follows: define one stratum for each partition of the outcome variable; expand each subject into a cluster of observations (1 observation per subject per partition); define X for each subject-partition as a binary indicator of that subject's membership of the higher group in that partition; and define Y for each subject-partition as the linear predictor for that subject of membership of the upper group in that partition. The summary c-statistic is then the Harrell's c of Y with respect to X, stratified by partition. As in:

somersd X Y, cluster(subject) wstrata(partition) transf(c) tdist

where -subject- is the subject ID for each subject-partition, -partition- is the partition variable for each subject-partition, and X and Y are as defined above. The c-statistic would then summarize the general ability of the linear predictors (as stored in Y) to predict the membership of upper groups of partitions (as stored in X), restricted to comparisons involving the same linear predictor for the same partition.
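The data expansion described above might be set up roughly as follows. This is only a sketch under stated assumptions: the names -subject-, -y- (outcome coded 1 to 4), -age- and -bmi- are hypothetical, -subject- is assumed to identify each row uniquely, and the linear predictors come from separate -logit- fits on each dichotomization, used here as a stand-in for the -gologit2- linear predictors.

```stata
* Hypothetical names: subject (unique ID), y (outcome coded 1-4), age and bmi
forvalues k = 1/3 {
    * X: indicator of membership of the upper group in partition `k'
    generate byte X`k' = (y > `k') if !missing(y)

    * Y: linear predictor for that partition (logit stand-in, an assumption)
    quietly logit X`k' age bmi
    predict double Y`k', xb
}

* One observation per subject per partition
reshape long X Y, i(subject) j(partition)

* Summary c-statistic, stratified by partition and clustered by subject
somersd X Y, cluster(subject) wstrata(partition) transf(c) tdist
```

Because the linear predictors are computed in the same data used for fitting, the caveat raised elsewhere in this thread about taking the confidence intervals seriously only with out-of-sample prediction applies here too.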

I hope this helps.

Best wishes

Roger


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Richard Williams
Sent: 07 October 2009 20:59
To: statalist@hsphsun2.harvard.edu; 'statalist@hsphsun2.harvard.edu'
Subject: Re: st: RE: st: C-statistic with -gologit2-

At 01:30 PM 10/7/2009, Newson, Roger B wrote:
In the case of ordinal regression, instead of using the predicted probability, you should use the linear predictor, computed using -predict- with the -xb- option. This linear predictor is an ordinal predictor of the outcome. It then makes sense to use the c-statistic, although the confidence intervals should only be taken seriously if calculated (using out-of-sample prediction) in a different dataset from the dataset in which the ordinal model was fitted.
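As a rough illustration of this advice, assuming a hypothetical ordinal outcome -y-, hypothetical predictors -age- and -bmi-, a hypothetical 0/1 variable -train- marking the fitting sample, and the user-written -somersd- package installed:

```stata
* Hypothetical names: y (ordinal outcome), age and bmi, train (1 = fitting sample)
ologit y age bmi if train == 1

* Linear predictor, computed for all observations including the hold-out sample
predict double xb_hat, xb

* Harrell's c of the linear predictor with respect to the outcome,
* evaluated only in the observations not used to fit the model
somersd y xb_hat if train == 0, transf(c) tdist
```

Here a split within one dataset stands in for the "different dataset" mentioned above; a genuinely separate dataset would be used the same way after fitting.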

Thanks Roger. This won't work with gologit2, because there are multiple equations and hence multiple XBs. gologit2 is like mlogit in that respect.

In the case of mlogit, there are multiple linear predictors, interpreted as the log odds ratios (per X-unit) of the various non-baseline outcomes compared to the baseline outcome. In that case, the c-statistic for the linear predictor for each non-baseline outcome only makes sense if restricted to observations with either that non-baseline outcome or the baseline outcome.

So, does that mean you would compute separate C statistics using only groups 1 and 2, then 1 and 3, then 1 and 4 (assuming group 1 is the baseline and there are 4 groups)?

gologit2 doesn't quite fit into this scheme either. gologit2 is like a series of binary logistic regressions with different dichotomizations of the original ordinal variable. First, it is group 1 versus groups 2, 3, 4; then groups 1 and 2 versus groups 3 and 4; then groups 1, 2 and 3 versus 4. If proportional odds holds, each dichotomization produces the same coefficients except for the intercepts. I am not sure how the C statistic fits in with such a scheme; perhaps, in the above, you would have 3 different C statistics?
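One way to picture the "3 different C statistics" idea is to run the three binary logistic regressions directly and read off the area under the ROC curve for each. This is a sketch only, with hypothetical names -y- (outcome coded 1 to 4), -age- and -bmi-, and it uses separate -logit- fits rather than -gologit2- itself:

```stata
* Hypothetical names: y (ordinal outcome coded 1-4), age and bmi (predictors)
forvalues k = 1/3 {
    * Dichotomization `k': groups 1..`k' versus groups `k'+1..4
    generate byte upper`k' = (y > `k') if !missing(y)

    quietly logit upper`k' age bmi

    * The reported area under the ROC curve is the c-statistic for this split
    lroc, nograph
}
```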


-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME:   (574)289-5227
EMAIL:  Richard.A.Williams.5@ND.Edu
WWW:    http://www.nd.edu/~rwilliam

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

--
Sara Muller	
Research Associate: Biostatistics
Arthritis Research Campaign National Primary Care Centre
Primary Care Sciences
Keele University
Staffordshire, ST5 5BG
Tel:  	+44 (0) 1782 734853
Fax:  	+44 (0) 1782 733911
Email:	s.muller@cphc.keele.ac.uk


