Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Nick Cox <njcoxstata@gmail.com>

To: statalist@hsphsun2.harvard.edu

Subject: Re: st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?

Date: Sat, 26 May 2012 16:20:48 +0100

The paper you cite is for a very specific problem with a very specific
generating process. That is the nub of the matter. You need to specify
what you are estimating with what model. The best way to approach this
is probably through simulation of what happens with the sample size of
concern to you and plausible assumptions.

There are many takes on this, however. For example, if a ratio is
badly behaved, then the best way to analyse data may be not to use
ratios. If a goal is misconceived, establishing which lousy method of
attempting that goal is least bad is not a good question.

These are banal generalities. One implication is that you may need to
disclose more specific details about what you want to do to get better
advice.

Nick

On Sat, May 26, 2012 at 3:31 PM, <guhjy@kmu.edu.tw> wrote:

> Is it true that "ratio of means" is less biased than "mean of ratios"
> (Comparing Ratio Estimators Based on Systematic Samples:
> http://www.isrt.ac.bd/sites/default/files/jsrissues/v40n2/v40n2p1.pdf)?

2012/5/26 Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>:

>> They estimate two different quantities - you decide which one you want:
>>
>> *******************************************
>> webuse census2, clear
>>
>> // ratio of means
>> ratio (deathrate: death/pop)
>> * or, more transparently
>> mean death pop
>> di _b[death]/_b[pop]
>>
>> // mean of ratios
>> g deathrate = death/pop
>> reg deathrate
>> * or, more transparently
>> mean deathrate
>> *******************************************

On Sat, May 26, 2012 at 12:19 AM, <guhjy@kmu.edu.tw> wrote:

>>> My point is that the mean and se differ between those obtained
>>> by the "ratio" command (which is supposedly more accurate) and the
>>> "regress" command. Thus, the results obtained by the "regress" command
>>> may be invalid. My question is: how to analyze ratios as the dependent
>>> or independent variables in regression if the mean and se of (Xi/Yi)
>>> are incorrect.
>>> For example:
>>>
>>> . webuse census2, clear
>>> (1980 Census data by state)
>>>
>>> .
>>> . gen drate1=death/pop
>>>
>>> .
>>> . reg drate1
>>>
>>>       Source |       SS       df       MS              Number of obs =      50
>>> -------------+------------------------------           F(  0,    49) =    0.00
>>>        Model |           0     0           .           Prob > F      =       .
>>>     Residual |  .000083179    49  1.6975e-06           R-squared     =  0.0000
>>> -------------+------------------------------           Adj R-squared =  0.0000
>>>        Total |  .000083179    49  1.6975e-06           Root MSE      =   .0013
>>>
>>> ------------------------------------------------------------------------------
>>>       drate1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>>        _cons |    .008436   .0001843    45.78   0.000     .0080657    .0088063
>>> ------------------------------------------------------------------------------
>>>
>>> .
>>> . ratio (deathrate: death/pop)
>>>
>>> Ratio estimation                    Number of obs    =      50
>>>
>>>     deathrate: death/pop
>>>
>>> --------------------------------------------------------------
>>>              |             Linearized
>>>              |      Ratio   Std. Err.     [95% Conf. Interval]
>>> -------------+------------------------------------------------
>>>    deathrate |   .0087368   .0002052      .0083244    .0091492
>>> --------------------------------------------------------------

2012/5/26 Steve Samuels <sjsamuels@gmail.com>:

>>>> Rich Goldstein's nice summary contains a reference to Dick Kronmal's article:
>>>>
>>>> Kronmal, R. A. (1993). Spurious correlation and the fallacy of the ratio standard
>>>> revisited. Journal of the Royal Statistical Society. Series A (Statistics in
>>>> Society), 379-392.
>>>>
>>>> Dick's thinking (and title) were inspired by:
>>>>
>>>> Tanner, J. M. (1949). Fallacy of per-weight and per-surface area standards,
>>>> and their relation to spurious correlation. Journal of Applied Physiology, 2(1), 1-15.
>>>>
>>>> Happily, Tanner's article is available online:
>>>>
>>>> http://0-jap.physiology.org.library.pcc.edu/content/2/1/1.full.pdf+html

Nick Cox

>>>> Your opening statement is more nearly incorrect than correct. In
>>>> general, X / Y is indeterminate whenever Y is 0; if X and Y are
>>>> normally distributed that is an event with probability 0 (which still
>>>> means possible) but the ratio is otherwise well defined.
>>>>
>>>> If Y is ever 0 in your data then the ratio X / Y is unlikely to make
>>>> scientific sense and so the question of what you can and can't do with
>>>> it statistically doesn't really arise.
>>>>
>>>> I don't think there is a simple answer to whether you should use
>>>> ratios in regression. Often it is scientifically natural; often it is
>>>> pretty dangerous.
>>>>
>>>> For one statement of various pitfalls see list member Richard
>>>> Goldstein on ratios:
>>>>
>>>> http://biostat.mc.vanderbilt.edu/wiki/pub/Main/BioMod/goldstein.ratios.pdf
>>>>
>>>> Better advice might depend on your giving more details on what you
>>>> want to do, mentioning the scientific or medical context as well.

On Fri, May 25, 2012 at 5:36 AM, <guhjy@kmu.edu.tw> wrote:

>>>>> The ratio of two normally distributed variables (X and Y) has no mean
>>>>> or variance.
>>>>> 1. Why is it valid that the "ratio" command estimates the mean and se of ratios?
>>>>> 2. Is it valid to use the individual ratios (i.e. Xi/Yi) in the
>>>>> dependent or independent part of a regression model?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
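
The simulation suggested at the top of this message could be sketched along the following lines, in Stata since that is what the thread uses. Everything here is an assumption made for illustration and none of it comes from the thread: the program name ratiosim, the sample size of 50 (chosen only to match the number of observations in census2), the uniform denominator on [1, 10), and the generating process x = 1 + 2*y + noise. Under those assumptions the population ratio of means is E[x]/E[y] = 12/5.5, about 2.18, while the population mean of ratios is 2 + E[1/y] = 2 + ln(10)/9, about 2.26, so the two estimators should centre near different values.

```stata
* Hypothetical sketch: names, sample size, and generating process are assumptions
clear all
set seed 20120526

capture program drop ratiosim
program define ratiosim, rclass
    version 12
    drop _all
    set obs 50
    * denominator bounded away from zero, so every x/y is well defined
    generate double y = 1 + 9*runiform()
    * numerator has an intercept, so x/y is not constant in y
    generate double x = 1 + 2*y + rnormal()
    generate double r = x/y

    * mean of ratios: average of the individual x/y
    summarize r, meanonly
    return scalar mor = r(mean)

    * ratio of means: mean(x)/mean(y)
    tempname mx
    summarize x, meanonly
    scalar `mx' = r(mean)
    summarize y, meanonly
    return scalar rom = `mx'/r(mean)
end

* 1000 replications; results replace the data in memory
simulate mor = r(mor) rom = r(rom), reps(1000) nodots: ratiosim
summarize mor rom
```

Setting the intercept to 0 in the line that generates x makes the two population targets coincide at 2, which is one way to see that which estimator looks "less biased" depends on the generating process and on what is being estimated. Letting the denominator range close to zero is a quick way to see how badly behaved the individual ratios can become.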


**References**:

- **st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?** *From:* guhjy@kmu.edu.tw
- **Re: st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?** *From:* Nick Cox <njcoxstata@gmail.com>
- **Re: st: Is it valid to use the individual ratios (i.e. Xi/Yi) in the dependent or independent part of a regression model?** *From:* Steve Samuels <sjsamuels@gmail.com>
- *From:* guhjy@kmu.edu.tw
- *From:* Tirthankar Chakravarty <tirthankar.chakravarty@gmail.com>
- *From:* guhjy@kmu.edu.tw
