
From: Roger Newson <roger.newson@kcl.ac.uk>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Can Y be a predicted variable?
Date: Fri, 09 Sep 2005 19:53:56 +0100

At 18:31 09/09/2005, Tina wrote:

> Dear statalisters,
>
> I have a dependent variable in 5 levels (Self-Assessed Health Status, from very good to very poor). I am currently assuming a latent continuous variable, but that is problematic for some of my analysis. I have some other measures of health in my data and was wondering if it was appropriate to create a new one that would be continuous. My suggestion would be:
>
> 1. Regress SAHS on other health variables.
> 2. Predict SAHS (let's call it SAHShat) based on the previous regression.
> 3. The new measure would be calculated as an average of SAHS and SAHShat.
>
> This looks like a good idea to me, but I wonder why I don't see anyone else doing this if it is OK. Those of you that fell off your office chairs in laughter could maybe get back on and explain why not, because it seems a fine idea to me right now.

In general, categorical ordinal outcomes like SAHS are a problem, especially if you want to have a parameter estimate that can be understood by non-statisticians.

A possible solution is to use Somers' D, which can be estimated (with confidence limits) using the -somersd- package (downloadable from SSC using the -ssc- command). Somers' D is defined in terms of Kendall's tau-a, which is defined as

tau(X,Y) = E[sign(X1-X2)sign(Y1-Y2)]

where (X1,Y1) and (X2,Y2) are sampled independently from the same population. Somers' D is defined as

D(Y|X) = tau(X,Y)/tau(X,X)

Therefore, Kendall's tau-a is the difference between 2 probabilities, namely the probability that the larger of 2 randomly-sampled X-values is associated with the larger of the 2 corresponding Y-values and the probability that the larger of the 2 X-values is associated with the smaller of the 2 Y-values. Somers' D is the difference between the 2 corresponding conditional probabilities, given that the 2 X-values are not equal. Somers' D and Kendall's tau-a are discussed in the manual -somersd.pdf-, distributed on SSC with the -somersd- package, and also in Newson (2002).
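
To make the two definitions concrete, here is a minimal sketch in Python (not Stata, and not how -somersd- computes its estimates or confidence limits). It evaluates the sample versions of tau(X,Y) and D(Y|X) by brute force over all pairs of observations. The function names and the small dataset are made up for illustration.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau_a(x, y):
    """Sample Kendall's tau-a: mean of sign(xi-xj)*sign(yi-yj) over all pairs."""
    pairs = list(combinations(range(len(x)), 2))
    total = sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) for i, j in pairs)
    return total / len(pairs)

def somers_d(y, x):
    """Sample Somers' D of y with respect to x: tau(X,Y) / tau(X,X)."""
    return kendall_tau_a(x, y) / kendall_tau_a(x, x)

# Hypothetical data: x = cigarettes per day, y = SAHS coded 1 (very good) to 5 (very poor).
x = [0, 0, 10, 20, 20]
y = [1, 2, 2, 4, 5]

tau_xy = kendall_tau_a(x, y)
d_yx = somers_d(y, x)

# The same D as a difference between two conditional probabilities,
# given that the two x-values in a pair are not equal.
untied = [(i, j) for i, j in combinations(range(len(x)), 2) if x[i] != x[j]]
p_conc = sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) > 0 for i, j in untied) / len(untied)
p_disc = sum(sign(x[i] - x[j]) * sign(y[i] - y[j]) < 0 for i, j in untied) / len(untied)

print(tau_xy, d_yx, p_conc - p_disc)
```

In this toy dataset, 8 of the 10 pairs have unequal x-values. Of those 8, 7 are concordant and none is discordant, so D(Y|X) is 7/8 and tau(X,Y) is 7/10.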

Tina does not mention the proposed predictor variables in the proposed regression model. However, in a multivariate regression model, there is usually one predictor X that is really interesting and other predictors that are confounders. For instance, we might want to know how daily cigarette consumption predicts SAHS, adjusting for confounders such as income, access to a car, and other indicators of general standard of living. To estimate a Somers' D of SAHS with respect to cigarettes adjusted for the confounders, the first step is to define a propensity score for cigarette consumption by regressing cigarette consumption on the confounders and using the predicted level of cigarette consumption (calculated using -predict-) as the cigarette propensity score. We can then use -xtile- to define a number of cigarette propensity groups from the propensity score, and use -somersd- with the -wstrata()- option to estimate a Somers' D of SAHS with respect to cigarette consumption stratified by cigarette propensity group. This Somers' D measures association between SAHS and cigarette consumption in pairs of patients in the same cigarette propensity group. If it is high, then we can say that higher cigarette consumers have poorer health than lower cigarette consumers with similar "cigarette propensity" based on the confounders. In other words, if the stratified Somers' D is high, then the poorer health of cigarette smokers is not explained by the fact that cigarette smokers are cigarette-prone because of their low general standard of living. Some references about propensity scores are given in the manual -somersd.pdf-.
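
The steps above can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions, not the -somersd- implementation: it uses a single made-up confounder (income), an ordinary least-squares fit written out by hand in place of a regression followed by -predict-, a crude rank-based grouping in place of -xtile-, and a point estimate restricted to within-group pairs with no confidence limits. All names and data are hypothetical.

```python
from itertools import combinations

def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fitted_values(z, x):
    """Least-squares fit of x on one confounder z; the fitted values are the score."""
    n = len(z)
    mean_z, mean_x = sum(z) / n, sum(x) / n
    slope = (sum((a - mean_z) * (b - mean_x) for a, b in zip(z, x))
             / sum((a - mean_z) ** 2 for a in z))
    return [mean_x + slope * (a - mean_z) for a in z]

def rank_groups(score, n_groups):
    """Split observations into roughly equal groups by ascending score.
    Unlike a true quantile rule, tied scores may land in different groups."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    groups = [0] * len(score)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(score) + 1
    return groups

def stratified_somers_d(y, x, strata):
    """Somers' D of y with respect to x, using only pairs in the same stratum."""
    num = den = 0
    for i, j in combinations(range(len(x)), 2):
        if strata[i] != strata[j]:
            continue  # pairs from different propensity groups are not compared
        sx = sign(x[i] - x[j])
        num += sx * sign(y[i] - y[j])
        den += sx * sx
    return num / den

# Hypothetical data: income (confounder), cigarettes per day, SAHS 1 (very good) to 5 (very poor).
income     = [10, 12, 15, 18, 30, 32, 35, 40]
cigarettes = [20, 10, 15, 5, 10, 0, 5, 0]
sahs       = [5, 3, 4, 3, 3, 1, 2, 2]

propensity = fitted_values(income, cigarettes)    # cigarette propensity score
groups = rank_groups(propensity, 2)               # two propensity groups
d_strat = stratified_somers_d(sahs, cigarettes, groups)
print(groups, d_strat)
```

Here the four low-income patients form the high-propensity group and the four high-income patients form the low-propensity group. Of the 11 within-group pairs with unequal cigarette consumption, 9 are concordant and none is discordant, so the stratified Somers' D is 9/11.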

I hope this helps.

Roger

References

Newson R. Parameters behind "nonparametric" statistics: Kendall's tau, Somers' D and median differences. The Stata Journal 2002; 2 (1): 45-64. Also downloadable from my website at http://phs.kcl.ac.uk/rogernewson/papers.htm

--

Roger Newson

Lecturer in Medical Statistics

Department of Public Health Sciences

Division of Asthma, Allergy and Lung Biology

King's College London

5th Floor, Capital House

42 Weston Street

London SE1 3QD

United Kingdom

Tel: 020 7848 6648 International +44 20 7848 6648

Fax: 020 7848 6620 International +44 20 7848 6620

or 020 7848 6605 International +44 20 7848 6605

Email: roger.newson@kcl.ac.uk

Website: http://phs.kcl.ac.uk/rogernewson/

Opinions expressed are those of the author, not the institution.

