# st: RE: -ciplot- and multiple by options

 From "Nick Cox" <[email protected]> To <[email protected]> Subject st: RE: -ciplot- and multiple by options Date Mon, 8 Oct 2007 19:48:25 +0100

```
The Statalist FAQ advises, as good practice, saying where
non-official commands come from. -ciplot- is a user-written
command on SSC.

The help for -ciplot- starts like this:
-------------------------------------------------------------------------

Plots of confidence intervals

ciplot varlist [if exp] [in range] [weight] [, by(byvar)
missing ci_options [horizontal | vertical]
rcapopts(rcap_options) scatter_options]

Description

-ciplot- produces a display of means and confidence intervals.  Means
are shown by point symbols and intervals by capped bars.  -ci- is
used for the calculations.  -aweights- and -fweights- are allowed; see
help on -weights-.

Options

-by()- defines a grouping variable, which is treated as categorical,
not measured.

[...]

scatter_options refers to options of graph twoway scatter.
----------------------------------------------------------------

and inside -ciplot- there is a standard -syntax- statement

syntax varlist(numeric) [if] [in] [aweight fweight]        ///
[ , BY(varname) LEVel(integer $S_level) Poisson BINomial    ///
Exposure(varname) EXAct Jeffreys Wilson Agresti Total      ///
Total2(str asis) MISSing INCLusive                         ///
YTItle(str) XTItle(str)                                    ///
HORizontal VERTical RCAPopts(str asis) plot(str asis) * ]

As far as -ciplot- is concerned -by()- needs a varname (not varlist),
as Sergiy reports.

So why can he get away with specifying two -by()- options? It
is not in general illegal to specify an option twice, and as from
Stata 8 much use is made of that in graphics.

In this case, what happens with two -by()- options is that the
first becomes the -by()- specified explicitly in the -syntax- and
the second gets smuggled past the border wrapped up in the
wildcard *.

-ciplot, by()- was intended to echo -ci, by()- when the latter
was part of -ci-'s standard syntax. (Now the overt syntax is
to use -by ...: ci- but a -by()- option is still allowed.)
This option never had a graphical role.

As the help above implies, -ciplot- does two main things,
fire up -ci- to get confidence intervals and then fire
up -graph- to plot them. If you go with the auto data

ciplot price, by(rep78) by(foreign)

then -by(rep78)- is used in calculating
confidence intervals and -by(foreign)- is used
in plotting them. The results will look odd
and in effect you will get a silly answer
to a silly question.

I don't feel especially inclined, however, to regard this as a
bug, as any user doing this is trying something way
beyond the documentation and should expect the
possibility of strange consequences. There may be nested
categorical variables for which this is
an undocumented way of getting a reasonable graph.

I suspect that -ciplot- has climbed to the end
of its branch of the evolutionary tree. A more
versatile command is -stripplot- from SSC.
Alternatively, use -egen, group()- before
using -ciplot-.

Nick
[email protected]

> In -ciplot- I cannot specify multiple variables in the "by()" option,
> but I can specify multiple "by()" options each with one variable. I
> wonder if ciplot is supposed to work like this?
>
> The results look quite strange [Stata v9.2, Windows]:
>
>    sysuse auto
>    ciplot price, by(foreign) by(rep78)
>
> or even like this:
>
>    ciplot price, by(rep78) by(foreign)
>
> Is there any deep statistical motivation for not allowing such graphs?
> or is it just a bug?
>
> In any case the above graphs look misleading, when confronted
> with the data:
>
>    table rep78 foreign, c(mean price sd price)
>
> Am I missing something?

```
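
The two mechanisms discussed in the reply can be tried out with a short sketch. The program name `bydemo` and the variable name `grp` are hypothetical, chosen only for illustration; `bydemo` is not part of -ciplot- and merely copies the relevant piece of its -syntax- statement (one explicit `BY(varname)` plus the wildcard `*`). Going by the explanation in the reply, the first -by()- should end up in the local macro `by` and the second in `options`. The second half follows the suggestion to use -egen, group()- first, and assumes -ciplot- has already been installed from SSC.

```stata
* Hypothetical demo program: shows where two by() options end up
* when -syntax- declares one BY(varname) and the wildcard *.
capture program drop bydemo
program define bydemo
    version 9
    syntax varlist(numeric) [ , BY(varname) * ]
    display as text "by      = `by'"
    display as text "options = `options'"
end

sysuse auto, clear
bydemo price, by(rep78) by(foreign)

* Workaround suggested in the reply: build one grouping variable
* from the two categorical variables, then pass it to a single by().
* grp is a hypothetical name; observations with missing rep78
* get a missing group code and so drop out.
egen grp = group(foreign rep78), label

* Assumes -ciplot- is installed, e.g. via: ssc install ciplot
ciplot price, by(grp)
```

With a single grouping variable, the same groups are used both for the -ci- calculations and for the plot, so the graph can be checked directly against `table rep78 foreign, c(mean price sd price)`.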