
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: resistant line or median median line
Date: Wed, 9 Feb 2005 13:34:20 -0000

There are several slightly different recipes for this line. Tukey used
similar ideas around the time of his Exploratory data analysis (1977),
and there is an older literature going back at least to the 1940s.

The key point of most of the recipes I have seen is that they are
amenable to hand calculation, insofar as the x and y medians of each
group can be determined by eye on a scatter plot for modest sample
sizes. So in a sense I think it's arguable that the method has been
superseded by quantile regression.

It is indeed not (guaranteed to be) exactly the same as quantile
regression. (However, is it true that a quantile regression necessarily
passes through (median of x, median of y)? I doubt it.)

I am not aware of a Stata implementation. Still, it is possible to
make a hack at one.

*! NJC 1.0.0 9 February 2005
program resline
        version 8
        syntax varlist(min=2 max=2) [if] [in] [, * ]

        quietly {
                marksample touse
                count if `touse'
                if r(N) == 0 error 2000

                tokenize `varlist'
                args y x

                * split the data into three groups on x, coded 0, 1, 2
                tempvar cut
                egen `cut' = cut(`x') if `touse', group(3)

                * median of y in each group
                su `y' if `cut' == 0, detail
                local y0 = r(p50)
                su `y' if `cut' == 1, detail
                local y1 = r(p50)
                su `y' if `cut' == 2, detail
                local y2 = r(p50)

                * median of x in each group
                su `x' if `cut' == 0, detail
                local x0 = r(p50)
                su `x' if `cut' == 1, detail
                local x1 = r(p50)
                su `x' if `cut' == 2, detail
                local x2 = r(p50)

                * slope from the two outer summary points
                local slope = ((`y2') - (`y0')) / ((`x2') - (`x0'))
                if `slope' == . {
                        di as err "no go: slope indeterminate"
                        exit 498
                }

                * level: mean of the three y medians
                local intercept = ((`y2') + (`y1') + (`y0')) / 3
                if `intercept' == . {
                        di as err "no go: intercept indeterminate"
                        exit 498
                }
        }

        di
        di as txt "slope" "{col 12}" as res %12.3f `slope'
        local b : di %4.3f `slope'
        di as txt "y summary" "{col 12}" as res %12.3f `intercept'
        local a : di %4.3f `intercept'
        local X1 : di %4.3f `x1'

        * line centred on the middle group's median x;
        * the title shows the rounded coefficients
        twoway function resistant =                           ///
        `intercept' + `slope' * (x - `x1'),                   ///
        range(`x') t1(`y' = `a' + `b' * (`x' - `X1'))         ///
        || scatter `y' `x' if `touse', `options'
end

e.g.

resline mpg weight

Nick
n.j.cox@durham.ac.uk

Faith Anne

> I need to calculate a specific type of line through a two-variable
> dataset. In exploratory data analysis, what I need is called a
> resistant line. In my high school classes, we called it a
> median-median line. The way it's calculated is to divide the data into
> three groups, find the x-median and y-median values (called the
> summary point) for each group, and then use those three summary points
> to determine the line. The outer two summary points determine the
> slope, and an average of all of them determines the intercept.
>
> As far as I can tell, this isn't quite the same as the quantile
> regression command, because the resistant line doesn't necessarily go
> through the median of the whole dataset. In the resistant line
> calculation, you ignore all information besides the summary points, so
> you don't actually take into account the absolute deviations and try
> to minimize them. Someone please correct me if I have misunderstood
> this!
>
> I'm aware of the pros and cons of this method as compared to least
> squares linear regression, but I am required to do this analysis and
> compare it to least squares. Minitab can do this through its menu of
> EDA commands, but I'm deeply frustrated with Minitab's data management
> and graphing, so I'd really like to know how to do this with Stata.
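
The recipe in the quoted message (three groups, one summary point per group, slope from the outer two points, intercept from an average over all three) can be checked by hand on a tiny dataset. What follows is only a minimal sketch, not part of the original post. The nine data points and the names grp, b and a are made up for illustration. It also assumes one common textbook reading of "an average of all of them", namely the mean of y - b*x over the three summary points, which differs slightly from the centring on the middle x median used in resline above.

```stata
* Minimal sketch: median-median line by hand on nine made-up points.
* The data and the names grp, b, a are illustrative assumptions.
clear
input x y
1 2
2 4
3 3
4 7
5 8
6 6
7 12
8 10
9 11
end

* Three equal-sized groups by rank of x (ties in x are not handled here)
sort x
generate byte grp = ceil(3 * _n / _N)

* Summary point (median x, median y) for each group
forvalues g = 1/3 {
    quietly summarize x if grp == `g', detail
    local x`g' = r(p50)
    quietly summarize y if grp == `g', detail
    local y`g' = r(p50)
}

* Slope from the two outer summary points
local b = (`y3' - `y1') / (`x3' - `x1')

* Intercept as the mean of y - b*x over the three summary points
* (an assumed reading of the recipe; other variants exist)
local a = ((`y1' - `b' * `x1') + (`y2' - `b' * `x2') + (`y3' - `b' * `x3')) / 3

display as txt "slope = " as res %6.3f `b' as txt "   intercept = " as res %6.3f `a'
```

With these made-up points the summary points are (2, 3), (5, 7) and (8, 11), so the slope is (11 - 3) / (8 - 2) = 4/3 and the averaged intercept is (21 - 15 * 4/3) / 3 = 1/3.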

