Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Quantile regression
Date: Sat, 22 Sep 2012 10:52:20 +0100

I will number your commands for ease of discussion.

1. xtile bmi_q = bmi, nquantiles(4)
2. bysort bmi_q sex:sum glucose, detail
3. bysort sex: anova glucose_log bmi_q
4. bysort sex: qreg bmi glucose age

#2 gives descriptive statistics, which no doubt could be useful. I would expect graphs to be at least as useful, e.g.

scatter glucose bmi || lowess glucose bmi, by(sex)

#1 and #3 are choices that seem very hard to defend in any statistical discussion. You are throwing away information on variability within quartile groups of -bmi- and degrading the data.

#4 is puzzling too. Why expect a linear relation between -bmi- and its predictors? If there are different relationships according to -sex-, the most usual tactic is not to fit separate models, but to fit a joint model with interactions between age and sex. If -glucose- is the response, it should not be the predictor in #4. Why is glucose treated as linear in one model and logged in another?

This is not my field, but I find it difficult to imagine that the science _demands_ thinking in terms of quartiles. Quartiles are at best a convenient categorisation and at worst an arbitrary and inefficient one.

Identifying a best predictor is never easy and often futile.

Nick

On Sat, Sep 22, 2012 at 9:05 AM, Vasan Kandaswamy <vasan.kandaswamy@ki.se> wrote:

> Thank you very much. I sincerely apologize for not having made my question clear.
>
> The scientific questions that I would like to address are:
> 1. How large a fold increase in the outcome variable (glucose) is observed from Quartile 1 to Quartile 4 of the predictor variable (BMI), and is this difference across quartiles significant?
> 2. How large is the unit change observed in the outcome variable?
> 3. With various predictors (BMI, waist, body fat, weight etc.), I want to see which one best predicts the outcome variable.
> 4. I would like to see all analyses separately for men and women.
>
> To address these, I went about it this way:
> 1. Derived the mean/median of the outcome variable in each quartile.
> 2. To compare the mean of glucose across quartiles of BMI for males (not compare the male mean and female mean in each quartile), I intend to do a one-way ANOVA (but was suggested a two-way).
> 3. To observe the unit change across quartiles, I wanted to do a regression model using qreg.
> 4. Finally, I am not sure how to go about finding out which is the best predictor of the outcome. (If I am not mistaken, I do not think I can do a standardized beta in qreg.)
>
> The script I used is
> xtile bmi_q = bmi, nquantiles(4)
> bysort bmi_q sex:sum glucose, detail
> bysort sex: anova glucose_log bmi_q
> bysort sex: qreg bmi glucose age
>
> I hope I have made it more understandable now.
> It would be really very useful to have your suggestions on these.
>
> David Hoaglin [dchoaglin@gmail.com] wrote:
> I'm puzzled. From the way in which you described your analysis in your first message, I don't understand why you would use quantile regression. As I recall, you wanted to compare the means of some variables across quartiles of BMI for males and females. In that description, it was not clear to me whether you wanted to compare the mean of a variable in data from males among the quartiles of BMI and similarly in data from females, or whether you wanted to compare the female mean and the male mean within each quartile of BMI, or whether you wanted to make both of these types of comparisons. I did not see any mention of the numbers of observations or the source of the data or, importantly, the scientific question that you are addressing.
>
> As I read the command below, you are asking -qreg- to fit a regression model to the median of BMI with predictors fast_glucose, etc. (the median is the default quantile in -qreg-). This seems far from what you set out to do.
>
> Those of us who are following this thread would be better able to advise you if you went back to the beginning and gave us more information on the data and the context. I do not know, for example, whether the data that you are analyzing are suitable for ANOVA. They may be (perhaps after a transformation), and you may have given up on ANOVA too quickly.
>
> On Wed, Sep 19, 2012 at 5:33 PM, Vasan Kandaswamy wrote:
>> Now, I have given up on ANOVA since I cannot derive p values for each gender separately, but did a regression.
>>
>> A quantile regression done this way comes out as follows:
>> bysort bmi_q sex:sum g0mmol
>> bysort sex: qreg bmi fast_glucose age pr ( adjusted for age)
>>
>> I tabulate the output this way:
>> BMI      Q1    Q2    Q3    Q4    Beta (95% CI)         P value
>> Male     5.3   5.4   5.5   5.6   2.61 (1.46, 3.76)     8.91 x 10^-06
>> Female   5.4   5.4   5.4   5.7   0.36 (-0.15, 0.86)    0.168
>>
>> If you actually look at the mean glucose values in Q1-Q5, there is not much difference, but the regression shows a clear difference, with the p value for males significant while that for females is not.
>>
>> Could you please explain if my approach is correct?
>> The basic question I would like to ask is whether the fold change from Q1 to Q5 is significant.
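
Editorial sketch: the joint model recommended in Nick Cox's reply above might look roughly like the following. It assumes variables named glucose, bmi, age and sex as in the quoted script, and that sex is stored as a numeric indicator (an assumption, since factor-variable notation such as i.sex requires a numeric variable). Interacting both bmi and age with sex, and refitting the same specification with -qreg-, are illustrative choices rather than anything the thread prescribes.

```stata
* Assumptions: glucose, bmi, age are continuous; sex is numeric (e.g. 0/1).

* Look at the data first: glucose against bmi with a lowess smooth, by sex
twoway scatter glucose bmi || lowess glucose bmi, by(sex)

* One joint model with glucose as the response and bmi kept continuous;
* the ## terms allow the bmi and age slopes to differ by sex
regress glucose c.bmi##i.sex c.age##i.sex

* Test whether the bmi slope differs between the sexes
testparm i.sex#c.bmi

* Estimated bmi slope within each sex
margins sex, dydx(bmi)

* Same specification as a median regression, if the median is the target
qreg glucose c.bmi##i.sex c.age##i.sex
```

Keeping bmi continuous uses the within-quartile variability that -xtile- would discard, and a single model with interactions gives a direct test of whether the relationship differs by sex, which separate -bysort sex:- fits do not.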