


RE: st: Quantile regression

From   Vasan Kandaswamy <>
To   "" <>
Subject   RE: st: Quantile regression
Date   Sat, 22 Sep 2012 10:55:19 +0000

Thank you very much, Nick, for your response.
It is very useful in taking the analysis forward. Is there any way that variability within quartile groups can be addressed?
Best regards,

From: [] on behalf of Nick Cox []
Sent: Saturday, September 22, 2012 11:52 AM
Subject: Re: st: Quantile regression

I will number your commands for ease of discussion.

1. xtile bmi_q = bmi, nquantiles(4)

2. bysort bmi_q sex:sum glucose, detail

3. bysort sex: anova glucose_log bmi_q

4. bysort sex: qreg bmi glucose age

#2 gives descriptive statistics, which no doubt could be useful. I
would expect graphs to be as or more useful, e.g.

scatter glucose bmi || lowess glucose bmi, by(sex)

#1 and #3 are choices that seem very hard to defend in any statistical
discussion. You are throwing away information on variability within
quartile groups of -bmi- and degrading the data.
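
A minimal sketch of keeping -bmi- continuous instead of cutting it into
quartiles, so that variation within quartile groups still informs the fit.
It assumes the variable names used in this thread and that -glucose_log- is
the natural log of glucose; neither is confirmed by the original post.

```stata
* Assumption: glucose_log = ln(glucose); variable names follow the thread
regress glucose_log bmi age
```

Under that assumption, exp() of the coefficient on -bmi- is the
multiplicative (fold) change in glucose per unit of BMI, adjusted for age.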

#4 is puzzling too. Why expect a linear relation between -bmi- and its
predictors? If  there are different relationships according to -sex-,
the most usual tactic is not to fit separate models, but to fit a
joint model with interactions between age and sex.

If -glucose- is the response, it should not be the predictor in #4.
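
One way the joint model might look, as a sketch only. It assumes -glucose-
is the response, that -sex- is a numeric indicator variable, and that both
the -bmi- and -age- slopes are allowed to differ by sex; the choice of
interactions is an illustration, not something the thread settles.

```stata
* Assumption: sex is numeric (e.g. 0/1); names follow the thread
regress glucose c.bmi##i.sex c.age##i.sex

* Joint test of whether the bmi and age slopes differ by sex
testparm i.sex#c.bmi i.sex#c.age

* Estimated slope of glucose on bmi within each sex
margins sex, dydx(bmi)
```

Factor-variable notation (c. and i. prefixes, ## for main effects plus
interaction) requires Stata 11 or later.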

Why is glucose treated as linear in one model and logged in another?

This is not my field, but I find it difficult to imagine that the
science _demands_ thinking in terms of quartiles. Quartiles are at best
a convenient categorisation and at worst an arbitrary and inefficient one.

Identifying a best predictor is never easy and often futile.


On Sat, Sep 22, 2012 at 9:05 AM, Vasan Kandaswamy
<> wrote:

> Thank you very much. I sincerely apologize for not having made my question clear.
> The scientific questions that I would like to address are:
> 1. What fold increase in the outcome variable (glucose) is observed from Quartile 1 to Quartile 4 of the predictor variable (BMI), and is this difference across quartiles significant?
> 2. How much unit change is observed in the outcome variable?
> 3. With various predictors (BMI, waist, body fat, weight, etc.), which one best predicts the outcome variable?
> 4. I would like to see all analyses separately for men and women.
> To address these : I went about this way
> 1. derived mean/median of outcome variable in each quartile
> 2. To compare the mean of glucose across quartiles of BMI for males (not to compare the male mean and female mean in each quartile), I intend to do a one-way ANOVA (but a two-way was suggested).
> 3. To observe the unit change across quartiles, I wanted to do a regression model using qreg.
> 4. Finally, I am not sure as to how to go about with finding out which is the best predictor of the outcome. ( If I am not mistaken, I do not think I can do a standardized beta in qreg).
> The commands I used are
> xtile bmi_q = bmi, nquantiles(4)
> bysort bmi_q sex:sum glucose, detail
> bysort sex: anova glucose_log bmi_q
> bysort sex: qreg bmi glucose age
> I hope I have made it more understandable now.
> Would be really very useful if I have your suggestions on these.

David Hoaglin []

> I'm puzzled.  From the way in which you described your analysis in
> your first message, I don't understand why you would use quantile
> regression.  As I recall, you wanted to compare the means of some
> variables across quartiles of BMI for males and females.  In that
> description, it was not clear to me whether you wanted to compare the
> mean of a variable in data from males among the quartiles of BMI and
> similarly in data from females, or whether you wanted to compare the
> female mean and the male mean within each quartile of BMI, or whether
> you wanted to make both of these types of comparisons.  I did not see
> any mention of the numbers of observations or the source of the data
> or, importantly, the scientific question that you are addressing.
> As I read the command below, you are asking -qreg- to fit a
> regression model to the median of BMI with predictors fast_glucose,
> etc. (the median is the default quantile in -qreg-).  This seems far
> from what you set out to do.
> Those of us who are following this thread would be better able to
> advise you if you went back to the beginning and gave us more
> information on the data and the context.  I do not know, for example,
> whether the data that you are analyzing are suitable for ANOVA.  They
> may be (perhaps after a transformation), and you may have given up on
> ANOVA too quickly.

> On Wed, Sep 19, 2012 at 5:33 PM, Vasan Kandaswamy

>> Now, I have given up on ANOVA since I cannot derive p values for each gender separately, but did a regression.
>> A quantile regression comes up this way
>> bysort bmi_q sex:sum g0mmol
>> bysort sex: qreg bmi fast_glucose age pr ( adjusted for age)
>> I tabulate the output this way
>> BMI       Q1    Q2    Q3    Q4    Beta (95% CI)         P value
>> Male      5.3   5.4   5.5   5.6   2.61 (1.46, 3.76)     8.91 x 10^-06
>> Female    5.4   5.4   5.4   5.7   0.36 (-0.15, 0.86)    0.168
>> If you actually look at the mean glucose values in Q1-Q5, there is not much difference, but the regression shows a clear difference, with the p value significant for males but not for females.
>> Could you please explain if my approach is correct.
>> The basic question I would like to ask is if the fold change from Q1 to Q5 is significant.