
Re: st: RE: Quantile question

From	Maarten buis <[email protected]>
To	stata list <[email protected]>
Subject	Re: st: RE: Quantile question
Date	Mon, 3 Mar 2008 14:42:36 +0000 (GMT)

--- Ronan Conroy wrote:
> I was involved in looking at the effect of a number of measures of
> the patient's response to illness as predictors of depression - work
> and social adjustment, symptom frequency, symptom bother, and so on.
> These were measured using standardised scales, but each scale had
> its own theoretical range, its own empirical score distribution and,
> most important, each scale was measured in arbitrary units.
>
> It was useful for the reader to be able to see the odds ratios (and
> confidence intervals) associated with a 1-decile increase in each of
> these predictors, as it gave them a way of comparing their effects
> and of judging their practical importance as well as their
> statistical significance.
>
> There are times, then, when quantiles do not lose information but  
> increase it, by converting unfamiliar and arbitrary measurement  
> scales to a definable measurement unit.

I think the issue here is terminology. Say we have a variable with an
arbitrary unit, say, symptom bother. The 9th percentile is the
value of the arbitrarily scaled variable symptom bother which has 9% of
the observations below it (http://en.wikipedia.org/wiki/Quantile).

What you seem to mean is the percentile rank, which in the example
above would be the number 9, and is a useful way of standardizing
variables: you can create a new variable with a known range (0 to 100),
mean (50), standard deviation (approx. 28.9, depending on the number of
ties) and distribution (uniform distribution).
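
To illustrate the percentile rank, here is a minimal sketch in Python (not Stata), using only the standard library. The data and the function name percentile_rank are made up for illustration, and the mid-rank treatment of ties is one convention among several, so treat this as a sketch under those assumptions rather than a reference implementation.

```python
import statistics

def percentile_rank(values):
    """Percent of observations below each value, counting ties as half
    (mid-rank convention; an assumption, other conventions exist)."""
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for w in values if w < v)
        equal = sum(1 for w in values if w == v)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks

# Hypothetical scores on an arbitrary, strongly skewed scale (no ties):
# the squares of a shuffled 0..999.
raw = [((i * 37) % 1000) ** 2 for i in range(1000)]

pr = percentile_rank(raw)
print(round(statistics.mean(pr), 2))    # mean of the percentile ranks
print(round(statistics.pstdev(pr), 2))  # close to 100 / sqrt(12)
```

Whatever the shape of the raw scale, the percentile ranks of tie-free data are spread evenly over 0 to 100, which is where the mean of 50 and the standard deviation of roughly 100/sqrt(12) come from.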

If that is the case then I agree with you. The more common alternative
(in my discipline anyhow) is z-scores (the variable minus the mean,
divided by the standard deviation), which are often implicitly
interpreted in terms of these percentile ranks: a value of 1.96 is large
because, if the variable is from a normal (Gaussian) distribution, 97.5%
of the respondents will have a value less than that. Notice that you now
need the additional assumption about the distribution, which percentile
ranks don't make.
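
The role of the normality assumption can be sketched as follows, again in Python with the standard library only. The data are hypothetical, and z_scores is an illustrative helper that uses the population standard deviation (an assumption; the sample version would differ slightly).

```python
from statistics import NormalDist, mean, pstdev

def z_scores(values):
    """(value - mean) / standard deviation, using the population sd."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

# Under a standard normal distribution, about 97.5% of values lie below 1.96.
std_normal = NormalDist()  # mean 0, sd 1
share_below = std_normal.cdf(1.96)

# Hypothetical, strongly skewed data: ninety 1s and ten 10s.
skewed = [1] * 90 + [10] * 10
z = z_scores(skewed)

top_z = max(z)  # z-score of the value 10
empirical_share = sum(1 for v in skewed if v < 10) / len(skewed)
normal_share = std_normal.cdf(top_z)

print(round(share_below, 3))
print(round(top_z, 2), empirical_share, round(normal_share, 4))
```

In the skewed example the share of observations below the top value is read directly from the data, while the normal-theory share for the same z-score is much higher, so the percentile interpretation of a z-score holds only as far as the normal assumption does.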

However, Nick is right when he points out that this is a non-linear
transformation. In particular, you will only retain information about
the ordering of values and lose information about the distances
between values.
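
A small sketch of that loss of information, in Python with made-up numbers (percent_below is a hypothetical helper and assumes there are no ties): two variables with the same ordering but very different spacing end up with identical ranks.

```python
def percent_below(values):
    """Percent of observations strictly below each value (assumes no ties)."""
    n = len(values)
    return [100.0 * sum(1 for w in values if w < v) / n for v in values]

def gaps(values):
    """Distances between neighbouring values after sorting."""
    s = sorted(values)
    return [b - a for a, b in zip(s, s[1:])]

# Two hypothetical variables: same ordering, very different distances.
a = [1, 2, 3, 100]
b = [1, 50, 99, 100]

ranks_a = percent_below(a)
ranks_b = percent_below(b)

print(gaps(a), gaps(b))    # the raw distances differ
print(ranks_a == ranks_b)  # the ranks cannot tell the two apart
```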

-- Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

visiting address:
Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------
