
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: -bandplot- available from SSC
Date: Mon, 24 Nov 2008 14:51:24 -0000

Thanks to Kit Baum, a new package -bandplot- is available from SSC.

The name "band plot" is my own. I wrote this program to do what it does and needed a name. I toyed briefly with names like -suygrxplot- but decided that I would probably forget such names myself. There is a small risk of this program being misunderstood as drawing one or more (coloured) bands to represent stacked series, but that is not what I am about.

-bandplot- requires Stata 8. Some more details follow my signature, but the help file includes them all (and more).

You can install -bandplot- using

. ssc inst bandplot

as usual. The package includes a rudimentary demo file, bandplottest.do. Open a new directory or folder and run it to see some example graphs.

Nick
n.j.cox@durham.ac.uk

-bandplot- produces plots showing summary statistics of one or more response variables for bands of one or more predictor variables. By default, -bandplot- is a wrapper for -graph dot-. Optionally, -bandplot- can be specified to be a wrapper for -graph hbar- or -graph bar-.

There are two syntaxes. In the first, -bandplot- takes the first variable in a varlist to be a response variable yvar, which is summarised for observations in each of various bands of the other predictor variables xvars. In the second, -bandplot- takes two or more variables specified first within parentheses () as being response variables yvars; all subsequent variables are then taken to be predictors xvars.

By default, -bandplot- shows means. Any other statistics produced by -summarize- may be specified. Note that with two or more yvars only one statistic may be shown.

"Bands" are to be interpreted as follows. By default numeric variables are divided into quantile-based bands. (By default in turn quartile-based bands are used.) Alternatively, variables can be declared explicitly or implicitly as categorical, in which case the distinct values of each such variable are used as bands.
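
As a rough illustration of what quartile-based bands of a numeric predictor amount to, here is a sketch using the auto data. The first part builds the bands by hand with official commands; this is only an approximation of the idea, not necessarily how -bandplot- computes its bands internally. The variable name weight4 is an arbitrary choice for illustration, and the last line assumes that -bandplot- has already been installed from SSC.

```stata
* Sketch only: quartile bands of weight built by hand
sysuse auto, clear
xtile weight4 = weight, nquantiles(4)
tabstat mpg, by(weight4) statistics(mean)
graph dot (mean) mpg, over(weight4)

* First syntax of -bandplot-: response first, then predictor(s);
* by default a numeric predictor is split into quartile-based bands
bandplot mpg weight
```
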
Any string variables specified as xvars are treated as categorical, regardless of any other specifications. No string variables may be specified as yvars.

The idea of showing summaries of responses for bands of one or more predictors evidently has a long history, which is difficult to trace. Plots summarizing polls or elections in terms of votes for major parties or candidates broken down separately by categorical variables such as sex, age, race or region are common. The particular choices here were inspired largely by examples given by Harrell (2001). See his pp. 126, 303f, 314f, 336.

What -bandplot- offers is perhaps best explained by a direct comparison with -graph dot-. There are three major differences and several minor differences. (Similar comments apply to -graph bar- or -graph hbar- if either is invoked.)

First, consider an example with the auto data. Compare

. graph dot (mean) mpg, over(foreign) over(rep78)

and

. bandplot mpg foreign rep78, cat(foreign rep78)

The -graph dot- command shows means of mpg for the cross-combinations of foreign and rep78 occurring in the data, i.e. one variable's classes are nested inside the other's. The -bandplot- command shows means of mpg separately for classes of each variable.

Second, -bandplot- supports quantile-based bands on the fly. You could show those with -graph dot-, but you would need to create any variables classed into bands first, say by using -xtile-.

Third, -graph dot- typically carries out a temporary reduction of the dataset, but -bandplot- carries out its own reduction and passes the results to -graph dot- for plotting -asis-. Various options of -graph dot- are thus irrelevant or inappropriate so far as -bandplot- is concerned. Further, variables in the dataset are not accessible to the -graph dot- command.

-bandplot- does not offer any rounding or coarsening option such as might be used to bin numeric variables into equal intervals. You would need to do that first.
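
For instance, a minimal sketch of binning weight in the auto data into 500 lb intervals beforehand might look like this. The name weight500 and the interval width are assumptions made purely for illustration, and -bandplot- must already be installed.

```stata
* Sketch only: coarsen weight to 500 lb bins, then treat the bins as categories
sysuse auto, clear
clonevar weight500 = weight
replace weight500 = 500 * floor(weight / 500)
bandplot mpg weight500, cat(weight500)
```
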
My advice is to use -clonevar- to create a copy of a variable (notably, keeping the variable label) and then to replace that with a binned version using a function such as -round()-, -floor()- or -ceil()-. Then declare such variables to -bandplot- as categorical [sic].

Although -bandplot- ignores missing values on the yvars, the structure of such missing values may be explored by creating an indicator for missingness using -missing()-.

Harrell, F.E. 2001. Regression Modeling Strategies: With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
