
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: Restricting range of values in a graph
Date: Thu, 16 Jul 2009 14:11:04 +0100

I almost totally agree with Steve's advice. He uses the word Winsorize a little more widely than is standard.

(By the way, I can assure anyone who reads that FAQ that the misbegotten word "gotten" did not appear in my original draft.)

I'd favour making the omission of outliers a little more evident. In this and some other respects -stripplot, box- is more flexible than -graph box- or -graph hbox-. -stripplot- is downloadable from SSC.

Consider as an example -price- in the auto dataset.

sysuse auto
clonevar price2 = price
replace price2 = 14000 if price2 > 14000
stripplot price2, over(foreign) box center stack width(250) ///
xla(4000(2000)12000 14000 "outliers")

gen outliers = price > 14000
stripplot price2, over(foreign) box center stack width(250) xla(4000(2000)12000 14000 "outliers") ///
separate(outliers) ms(oh S) legend(off)

Nick
n.j.cox@durham.ac.uk

sjsamuels@gmail.com

Try the -nooutsides- option or switch to another scale and show everything. See: Nick Cox's FAQ at http://www.stata.com/support/faqs/graphics/boxandlog.html . What he demonstrates can apply to scales other than the log.

If you want to show some of the outside points, but not all, you will have to Winsorize the points you want to hide. Replace them with a value at the upper end of your desired graph range and give them an invisible marker symbol. This will leave the rest of the boxplot unchanged. You can add text at that value to show the number of higher points excluded.

This problem comes up for other commands in which Stata computes the plotting points; -stcurve- is an example. Stata has a -range- option for axes, but it can only expand, not contract, the plotting range.

On Thu, Jul 16, 2009 at 3:09 AM, Dana Chandler<dchandler@gmail.com> wrote:
> I am preparing some graphs with simple boxplots over various groups.
> Thus on my x-axis, I have categorical variables for population groups.
> My y-axis has # of businesses of a certain type within each population
> group.
>
> Unfortunately, I would like to be able to only show the y-axis within
> a certain range (so as to not have outliers distort the picture). One
> idea I had was to simply do the graph and add "IF #businesses < 50".
> This will make the graph visible, but will distort the IQR of the
> boxplot. The "yscale(r(0 25))" command does not seem to work and seems
> only to "extend" a range of y-values rather than restrict it. Does
> anyone have a suggestion for how to construct a graph for the entire
> range of data but only display it over a specific range?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
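
Steve's suggestion to add text showing how many higher points were excluded could be sketched with official -graph box- roughly as follows. This is a minimal sketch, not code from the thread: the local macro names -cap- and -nclipped- are made up for illustration, and the cap of 14000 is assumed to lie beyond the upper whisker in every group, so that the boxes and whiskers are unaffected. It is meant to be run from a do-file, since it uses /// continuation.

```stata
sysuse auto, clear

* Hypothetical cap: values above it are pulled down to the cap
local cap 14000

* Count the points that will be pulled down (missing counts as
* larger than any number in Stata, hence the guard)
count if price > `cap' & !missing(price)
local nclipped = r(N)

* Work on a copy so that the original variable stays intact
clonevar price2 = price
replace price2 = `cap' if price > `cap' & !missing(price)

* Label the cap on the axis with the number of points pulled down to it
graph box price2, over(foreign) ///
    ylabel(4000(2000)12000 `cap' "over `cap' (n = `nclipped')")
```

If the cap falls inside the whiskers for some group, the whisker for that group will change, so it is worth checking the upper adjacent values before choosing it.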



© Copyright 1996–2014 StataCorp LP