Re: st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?

From   David Hoaglin <>
Subject   Re: st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?
Date   Sun, 27 May 2012 07:35:51 -0400

Dear Jinn-Yuh,

My answer is, "It depends."

In an earlier message, you explained that ACR is used to standardize
urinary concentration (of urinary albumin, I think) to ensure
comparability of albuminuria among individual patients.  "Standardize"
may be a bit too strong; it may be that dividing by urinary creatinine
merely adjusts for variation among

If ACR is the variable that clinicians work with, you can definitely
use ACR as either the dependent variable or an explanatory variable.

Sometimes it is preferable to work with concentration data in a log
scale (either explicitly or by leaning on the -poisson- command to use
quasi-likelihood to fit a linear predictor in the log scale without
transforming the data --- the latter approach is a separate
discussion, and I won't pursue it here).
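
For readers who want to see the two log-scale routes side by side, here
is a minimal sketch in Python with NumPy. It is not the -poisson-
command itself: it assumes simulated data, a single hypothetical
covariate x, and a hand-written iteratively reweighted least squares
loop (fit_log_link is an illustrative helper, not a library function)
that solves the same Poisson score equations for a log link.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)  # hypothetical standardized covariate
# Simulated positive, right-skewed response (stands in for a concentration).
y = np.exp(3.0 + 0.5 * x + rng.normal(scale=0.5, size=n))
X = np.column_stack([np.ones(n), x])

# Route 1: transform explicitly, then ordinary least squares on log(y).
beta_logols = np.linalg.lstsq(X, np.log(y), rcond=None)[0]

# Route 2: leave y untransformed and fit E[y] = exp(X @ beta) by
# iteratively reweighted least squares (Poisson score equations).
def fit_log_link(X, y, max_iter=50, tol=1e-10):
    beta = np.linalg.lstsq(X, np.log(y), rcond=None)[0]  # start values (needs y > 0)
    for _ in range(max_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response for the log link
        XtW = X.T * mu               # working weights equal mu
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

beta_ql = fit_log_link(X, y)
print("OLS on log(y):          ", beta_logols)
print("log link, y untransformed:", beta_ql)
```

The first fit models the mean of log(y) and the second models the log
of the mean of y, so the intercepts generally differ even when the
slopes are similar. Standard errors for the second route would need a
robust (sandwich) estimator, which this sketch does not compute.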

One can use regression for a variety of purposes.  You may be
interested mainly in prediction, or in the values of one of the
coefficients in the regression (for example, how ACR varies with
cholesterol when you adjust for the contributions of age and gender).
(These two do not exhaust the list of purposes.)  A regression model
for either of these purposes could have ACR as the dependent variable.
Depending on the research that led to the use (adoption?) of ACR, it
might also be instructive to use urinary albumin as the dependent
variable and urinary creatinine as one of the explanatory variables.
I could also see working with log(ACR) and with log(urinary albumin)
and log(urinary creatinine) in parallel analyses.
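
One way to see how the parallel log-scale analyses relate: because
log(ACR) = log(urinary albumin) - log(urinary creatinine), a model for
log(ACR) is the special case of a model for log(urinary albumin) in
which the coefficient of log(urinary creatinine) is fixed at 1. The
sketch below, in Python with NumPy, checks that identity on simulated
data. All variable names, units, and coefficient values are
hypothetical; nothing here comes from the actual patients.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
age = rng.normal(60, 12, n)          # hypothetical covariates
chol = rng.normal(200, 35, n)
log_creat = rng.normal(4.5, 0.5, n)  # log(urinary creatinine), simulated
log_alb = (1.0 + 0.01 * age + 0.002 * chol + 0.8 * log_creat
           + rng.normal(0, 0.6, n))  # log(urinary albumin), simulated
log_acr = log_alb - log_creat        # log of the ratio

def ols(y, *cols):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Albumin as the response, creatinine as an explanatory variable.
b_alb = ols(log_alb, age, chol, log_creat)
# Ratio as the response, still adjusting for creatinine.
b_acr_adj = ols(log_acr, age, chol, log_creat)
# Ratio as the response with no further adjustment.
b_acr = ols(log_acr, age, chol)

print("log(albumin) model:       ", b_alb)
print("log(ACR) + log(creat):    ", b_acr_adj)
print("log(ACR) only:            ", b_acr)
```

The first two fits agree except that the creatinine coefficient in the
second is exactly 1 lower. An estimated creatinine coefficient near 1
in the albumin model would suggest that the simple ratio adjusts
adequately; a value well away from 1 would suggest that it does not.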

I'm not familiar with the physiology, so I don't know whether it is
meaningful to have ACR as the dependent variable and cholesterol as an
explanatory variable and also to have cholesterol as the dependent
variable and ACR (or urinary albumin and urinary creatinine) as an
explanatory variable.  As an explanatory variable, ACR is one function
of urinary albumin and urinary creatinine; but you could reasonably
consider other functions, such as the linear combination of urinary
albumin and urinary creatinine that arises from using those two as
explanatory variables or the nonlinear function in which the
explanatory variables in that part of the model are urinary albumin,
urinary creatinine, and their product (for this version, it would be a
good idea to center the two variables by subtracting suitable values
before taking their product).
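
The effect of centering before forming the product can be sketched as
follows (Python with NumPy, simulated data). The variable names, the
units, and the choice of the sample mean as the "suitable value" to
subtract are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
alb = rng.normal(50, 10, n)      # hypothetical urinary albumin
creat = rng.normal(100, 20, n)   # hypothetical urinary creatinine
chol = (150 + 0.3 * alb + 0.2 * creat
        + 0.01 * (alb - 50) * (creat - 100)
        + rng.normal(0, 15, n))  # simulated response

def ols(y, *cols):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Center each variable (here at its sample mean) before multiplying.
alb_c = alb - alb.mean()
creat_c = creat - creat.mean()

b_raw = ols(chol, alb, creat, alb * creat)
b_cen = ols(chol, alb_c, creat_c, alb_c * creat_c)

# Correlation between a main effect and the product term.
corr_raw = np.corrcoef(alb, alb * creat)[0, 1]
corr_cen = np.corrcoef(alb_c, alb_c * creat_c)[0, 1]

print("product coefficient, raw vs centered:", b_raw[3], b_cen[3])
print("corr(albumin, product), raw vs centered:", corr_raw, corr_cen)
```

Both versions span the same model, so the coefficient of the product
term is unchanged. What centering changes is the correlation between
the product and its components, and the meaning of the two main-effect
coefficients, which become slopes at the centering values rather than
at zero.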

I have focused mainly on model building.  That is probably the main
issue.  Fortunately, you have enough data (about 500 patients) to
develop a reasonable model.  You may have been looking for a
straightforward answer, and I have given you a rather complicated one.
In practice, careful analyses of data are seldom simple.  In this
instance, if you are not already familiar with regression diagnostics,
it would be worthwhile to learn about them.  They should be helpful as
you proceed with the analysis of your data.
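
As a starting point, three basic diagnostics (leverage, studentized
residuals, and Cook's distance) can be computed directly from the
design matrix. The sketch below uses Python with NumPy on simulated
data; the variables and the cutoffs 2p/n and 2 are common rules of
thumb chosen for illustration, not recommendations specific to these
data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
age = rng.normal(60, 12, n)      # hypothetical covariates
chol = rng.normal(200, 35, n)
log_acr = 2.0 + 0.01 * age + 0.003 * chol + rng.normal(0, 0.6, n)

X = np.column_stack([np.ones(n), age, chol])
p = X.shape[1]

# Leverage: diagonal of the hat matrix H = X (X'X)^{-1} X',
# obtained from a thin QR decomposition as the row sums of Q**2.
Q, _ = np.linalg.qr(X)
leverage = np.sum(Q**2, axis=1)

beta = np.linalg.lstsq(X, log_acr, rcond=None)[0]
resid = log_acr - X @ beta
s2 = resid @ resid / (n - p)     # residual variance estimate

# Internally studentized residuals and Cook's distance.
std_resid = resid / np.sqrt(s2 * (1 - leverage))
cooks_d = std_resid**2 * leverage / (p * (1 - leverage))

# Flag observations worth a closer look.
high_leverage = np.flatnonzero(leverage > 2 * p / n)
large_resid = np.flatnonzero(np.abs(std_resid) > 2)
print("high-leverage observations:", high_leverage.size)
print("large residuals:", large_resid.size)
print("largest Cook's distance:", cooks_d.max())
```

Flagged observations are candidates for inspection, not for automatic
deletion. Plots of residuals against fitted values and against each
explanatory variable are a useful complement to these numbers.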

David Hoaglin

On Sun, May 27, 2012 at 5:07 AM,  <> wrote:
> Dear David:
> In a sample (not a survey sample) of about 500 hospital chronic kidney
> disease patients, I am using ACR as the:
> 1. Dependent variable: regress ACR age gender cholesterol (Is it
> better to regress urinary albumin on urinary creatinine, age, gender
> and cholesterol?)
> 2. Independent variable: regress cholesterol age gender ACR (Is it
> better to regress cholesterol on age, gender, urinary albumin and
> urinary creatinine?)
>  "Patients with chronic kidney disease" is the population in the
> inferential statistics. The population ACR (not the population
> totals of urinary albumin or urinary creatinine) is my concern.
> Thank you.
> Jinn-Yuh
> 2012/5/27 David Hoaglin <>:
>> Dear Jinn-Yuh,
>> In a notation that is customary in survey sampling, X/Y (perhaps more
>> commonly Y/X) is the ratio of two population totals.  Please tell us
>> more about the population for which you would like to estimate the
>> ratio of the population total of urinary albumin to the population
>> total of urinary creatinine.
>> If you are calculating ACR for individual patients, and that is the
>> variable that you are using in your regressions, how are the
>> population totals related to those regressions?  The relevance of the
>> biases that you have mentioned to your analysis is not yet clear.  It
>> would help if you described one of the multiple regression models that
>> you are using.
>> David Hoaglin
>> On Sat, May 26, 2012 at 9:02 PM,  <> wrote:
>>> ACR (urinary albumin creatinine ratio, i.e. urinary albumin [Xi]
>>> divided by urinary creatinine [Yi]) is used to standardize for urinary
>>> concentration to ensure comparability of albuminuria among individual
>>> patients. I am using
>>> ACR as the dependent or independent variable in multiple linear
>>> regressions. However, "ratio of means" and "mean of ratios" (ACR
>>> [Xi/Yi] in this case) are both biased estimates of the population
>>> ratio [X/Y] (see "Mean of ratios or ratio of means or both?").
>>> In view of these problems and the many pitfalls of ratios mentioned in
>>> many references, is it better to use X (or Y) to adjust for Y (or X)
>>> in regressions (despite the clinical usefulness of ACR in individual
>>> decisions)?
>>> Thank you.
>>> Jinn-Yuh