
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: RE: RE: truncating graph range
Date: Wed, 2 Nov 2005 12:52:41 -0000

Next door to "Can you do X in Stata?" is "Should you do X anyway?".
I focused on the first in my previous replies, but the second is
crucial too. Nothing to follow affects the key facts that this is
your report and you should know the audience, but I have comments
on various levels.

-1. What is Excel?

0. Showing the data fully and honestly is by far the best strategy,
and the reasons for not doing that should be clear and overwhelming.

1. If your readers are confused by logs, one option is to explain
them. ("The Economist" regularly uses log scales without apology.)
The first graph could explain what log scales are.

2. If your outliers are so extreme that you want to exclude them
from many graphs, does the rest of your analysis cope with the
outliers optimally?

3. Some things can be done without low-level programming.
I generated some spiky series

    clear
    set seed 2803
    set obs 100
    forval i = 1/10 {
        gen y`i' = 1/uniform()
    }
    su
    gen x = _n

and then generated a series of graphs in uniform style.
The idea was to omit values above a threshold, but to show
the values as text just above that threshold and to show
the omission explicitly by a break in the line. This falls
short of lines pointing towards the outlier.

You would need to choose your own threshold and might not
need to set -mlabangle(vertical)-, which leads to giraffe
graphics, but it seems quite likely that in real data, even
more than in random data, outliers may be next to each other.

    qui gen show = ""
    gen high = 105
    forval i = 1/10 {
        clonevar temp = y`i'
        qui replace temp = . if y`i' > 100
        qui replace show = string(y`i', "%7.1g") if y`i' > 100
        line temp x , cmissing(n) ysc(r(0,105)) || ///
            scatter high x , ms(none) mlabpos(6) mla(show) mlabangle(vertical) ///
            legend(off) ytitle(y`i')
        more
        qui replace show = ""
        drop temp
    }

Nick
n.j.cox@durham.ac.uk

Timothy Dang

> Thanks Nick & Allan. I have a lot of graphs I'm going to need to put
> together for an appendix, and I was thinking that automating it with
> Stata would give me the uniform appearance I wanted relatively
> painlessly, except for that outlier problem. I'm pretty sure that in
> almost all the cases a log scale would be confusing to the reader, so
> I don't want to go with that.
>
> Allan, I'll play with your suggestions and see if they do the trick
> for me. Otherwise, it may be graphing in Excel, which does this
> readily.
>
> In the spirit of Nick's reminder on closing threads, I'll come back
> and report if there's a solution that worked for me.
>
> Thanks!
>
> On 10/31/05, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> > Allan's main advice is to fit a regression line with
> > an outlier and to show it on a graph together
> > with all the other data.
> >
> > That's often a useful technique, but I read
> > Timothy's discussion of line plots as wanting
> > something quite different, namely
> >
> >            * outlier off graph
> >
> >           / \
> >          /   \
> >         /     \   lines on graph pointing to it
> >
> > I am sure that this is programmable, but I don't know an
> > easy and general way for a user to do it.
> >
> > Likewise Allan's other suggestions do not seem to bear
> > on this problem.
> >
> > Nick
> > n.j.cox@durham.ac.uk
> >
> > Allan Reese (Cefas)
> >
> > > Hate to disagree with Nick, but Stata is well-designed for
> > > intelligent graph editing. Timothy maybe needs to fiddle
> > > with a few alternatives and work out what would show what he
> > > intends. A log scale is one option but has many other
> > > implications.
> > >
> > > For example, it's straightforward in Stata to draw lines
> > > with/without outliers. Other "point'n'click" packages don't
> > > make this easy, so suppress the desire.
> > >
> > > fit y x
> > > predict yhat1
> > > fit y x if y<1000
> > > predict yhat2
> > > scatter y yhat1 yhat2 x if y<1000, connect(. l l) msym(o i i) sort
> > >
> > > Another simple trick is to copy one variable into several, so
> > > subsets can be distinguished on the plot. You could automate
> > > this (eg, using egen to save the max value of x), but I'd
> > > usually do it as part of visual editing, for example to add
> > > text labels to the points at the end of each line. It's
> > > therefore feasible to draw a line for the data excluding the
> > > outlier, and add a second line in different style pointing up
> > > with a label at its end describing the outlier.
> > >
> > > This is the type of work where I'd draft commands in a DO
> > > file so they are easily modified and re-run.
> >
> > Nick Cox
> >
> > > What you want is _not_ straightforward. I know no easy and also
> > > general way of omitting a data point from a Stata graph and also
> > > having it exert some offstage influence on the remainder of
> > > the graph.
> > >
> > > In my experience, when people think they want something like
> > > this using a logarithmic scale for the variable concerned is
> > > usually the best way forward.
> >
> > Timothy Dang
> >
> > > > I'm making a lot of (line) plots in Stata, and mostly it's
> > > > working great, but I've hit a snag. For a few of my data sets,
> > > > there are some data points which are extraordinarily high.
> > > > With the automatically scaled axis ranges, these points are
> > > > visible, but all the detail of the rest of the data is shrunk
> > > > to invisibility.
> > > >
> > > > So, I want to:
> > > > (a) enforce a maximum for the axis, hopefully showing the
> > > > lines going up towards some point not shown on the plot, and
> > > > (b) add some text describing what happens at those points
> > > > (I can do this outside Stata if needed).
> > > >
> > > > Hopefully this is straightforward and I've just missed
> > > > something. Thanks for any pointers.
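
Editorial sketch, not part of the thread: Nick's loop omits values above the threshold and notes that this "falls short of lines pointing towards the outlier". One possible variation, under stated assumptions, is to clamp the high values at the top of the axis instead of setting them to missing, so the line climbs to the edge of the plot while the label still reports the true value. The variable names (yclamp, show), the cutoff of 100 and the 5% headroom are illustrative choices, and runiform() is the name used by newer Stata releases for the uniform() function in the message. Whether a clamped line is honest enough for a given audience is exactly the "Should you do X anyway?" question raised in the message.

```stata
* Sketch only: a clamped variant of the loop above, for a single series.
* Run from a do-file; names and the cutoff are assumptions, not Nick's.
clear
set seed 2803
set obs 100
gen x = _n
gen y = 1/runiform()

* Threshold above which values count as "off the graph": choose your own
local cut 100
* Top of the y axis, with a little headroom above the threshold
local top = 1.05 * `cut'

* Clamp rather than omit, so the line rises to the top edge at each outlier
gen yclamp = min(y, `top')

* Label text carries the true value for every point above the threshold
gen show = string(y, "%7.1g") if y > `cut'

line yclamp x, yscale(range(0 `top')) ||                  ///
    scatter yclamp x if y > `cut', msymbol(none)          ///
    mlabel(show) mlabposition(6) mlabangle(vertical)      ///
    legend(off) ytitle("y")
```

As with the original loop, adjacent outliers will produce overlapping labels, so the label angle and position may need adjusting for real data.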
*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

