



st: graphing estimates and confidence intervals


From   Nick Cox <njcoxstata@gmail.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: graphing estimates and confidence intervals
Date   Fri, 10 May 2013 13:52:23 +0100

Two recent threads both centred on graphical display of estimates
together with confidence intervals. The starting points were

http://www.stata.com/statalist/archive/2013-05/msg00293.html

http://www.stata.com/statalist/archive/2013-05/msg00310.html

This post is intended mainly as a kind of broad-brush overview of the
question. It also adds some detail omitted from those threads. In
turn, naturally, please comment if I miss anything of importance or
interest.

The main idea is that while estimates can be plotted easily with
-twoway scatter- or -graph dot- you are in practice going to find it
difficult to show confidence intervals directly other than by -twoway
rcap-. (It's only convention that might inhibit you from using -twoway
rspike- instead.) It follows that you need to focus on using -twoway-.
Bluntly, -graph dot- (or -graph bar- for those so inclined) is a dead
end here.

There are two broad strategies.

1. You can build your own command by assembling a composite -twoway-
call using -scatter- for the point estimates and -rcap- for the
intervals. This can be combined, with increasing difficulty, with
showing different results for different groups on one or more levels.
An example to explain levels here: using sex as a classifier gives one
level and using race or region or both would add one or two more
levels.

With one level you will presumably just want to plot your grouping
variable on one of the axes.
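
As a minimal sketch of the one-level case, assuming the auto dataset
shipped with Stata and t-based 95% intervals for means (both choices
are illustrative, not taken from the threads above):

```stata
* Sketch: mean mpg with t-based 95% intervals for each level of rep78.
* Run from a do-file, because /// continues a line.
sysuse auto, clear
drop if missing(rep78)

* reduce to one observation per group
collapse (mean) mean=mpg (sd) sd=mpg (count) n=mpg, by(rep78)
gen se = sd / sqrt(n)
gen lb = mean - invttail(n - 1, 0.025) * se
gen ub = mean + invttail(n - 1, 0.025) * se

* intervals first, point estimates on top
twoway rcap lb ub rep78 || scatter mean rep78, ///
    ytitle("Mean mileage (mpg)") xtitle("Repair record 1978") legend(off)
```

Replacing -rcap- by -rspike- gives intervals without caps.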

With two or more levels, using -by()- is the easiest way to add
an extra level of classification, but just adding spacing can be as
effective or more so. Sometimes with -by()- there is too much
scaffolding and too much loss of real estate.
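
A rough sketch of both approaches for two levels, again assuming the
auto dataset and t-based intervals for means. The offset of 0.15 and
the marker symbols are arbitrary choices for illustration:

```stata
* Sketch: two levels of classification (rep78 within foreign).
* Run from a do-file, because /// continues a line.
sysuse auto, clear
drop if missing(rep78)
collapse (mean) mean=mpg (sd) sd=mpg (count) n=mpg, by(foreign rep78)
gen se = sd / sqrt(n)
gen lb = mean - invttail(n - 1, 0.025) * se
gen ub = mean + invttail(n - 1, 0.025) * se

* (a) one panel for each level of foreign
twoway rcap lb ub rep78 || scatter mean rep78, ///
    by(foreign, legend(off) note("")) ytitle("Mean mileage (mpg)")

* (b) one panel, with the two groups offset slightly on the x axis
gen x = rep78 + cond(foreign, 0.15, -0.15)
twoway rcap lb ub x if !foreign || scatter mean x if !foreign, ms(O) ///
    || rcap lb ub x if foreign || scatter mean x if foreign, ms(T) ///
    legend(order(2 "Domestic" 4 "Foreign")) ///
    xtitle("Repair record 1978") ytitle("Mean mileage (mpg)")
```

Version (b) keeps everything on a single pair of axes, at the cost of
building the x variable yourself.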

If you have any group variable that is string, things are easier if
you -encode- it or use -egen, group()- to produce an equivalent
numeric variable with value labels.
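
For illustration, a string group variable is manufactured here from
the auto dataset (the variable names origin, origin_n and origin_g are
made up for this sketch) and then converted both ways:

```stata
* Sketch: turn a string grouping variable into a labelled numeric one.
sysuse auto, clear
gen str8 origin = cond(foreign, "Foreign", "Domestic")

* -encode- assigns integers 1, 2, ... in alphabetical order of the strings
encode origin, gen(origin_n)

* -egen, group()- does much the same; -label- copies the strings as value labels
egen origin_g = group(origin), label

tabulate origin_n origin_g
```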

2. Alternatively, you can look for a command that does all that for
you. The commands differ in whether they expect that you already have
the estimates (point and interval) or they will undertake to do that
calculation for you. The more standard the calculation, the more
likely that a canned command already exists.

-serrbar- is an old official command which doesn't do much but may
match simple needs. My impression is that it is little known, but that
may be because it is little mentioned, and that in turn because it is
of little use.
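
A small sketch of -serrbar-, which expects variables holding the
estimate, its standard error and an x variable. The reduced dataset is
built here with -collapse-, and the multiplier 1.96 is just a normal
approximation chosen for illustration:

```stata
* Sketch: -serrbar- on means and standard errors prepared beforehand.
sysuse auto, clear
drop if missing(rep78)
collapse (mean) mean=mpg (semean) se=mpg, by(rep78)

* bars extend scale() standard errors either side of the mean
serrbar mean se rep78, scale(1.96)
```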

-dotplot- is an official command which supports display of mean +/-
SD. It's worth knowing that, but it's unlikely to be what you want
under this heading.

-ciplot- is an oldish user-written command (SSC, Nick Cox). Its basic
idea is to call up -ci- repeatedly and then plot the results. There is
support for multiple groups and multiple variables. If it doesn't go
as far as you want, the bad news is that I have no interest in
developing it, but it's more flexible than any official command I can
recall. For example,

sysuse auto
ciplot foreign , binomial jeffreys by(rep78)

shows how you can reach through to -ci-.

-stripplot- (SSC, Nick Cox) was mentioned in recent posts. Its display
of confidence intervals is based on exactly the same idea as -ciplot-,
to call up -ci- for the calculations. Its philosophy is to show the
raw data too, although nothing beyond an ectoplasmic sense of my mild
disapproval stops you suppressing the data display with e.g.
-ms(none)-.
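
A hedged sketch, assuming -stripplot- has been installed from SSC and
that its -bar- option is what requests means and confidence intervals
(check the help for your installed version):

```stata
* ssc install stripplot    // uncomment if not yet installed
sysuse auto, clear

* raw data together with means and confidence intervals
stripplot mpg, over(rep78) vertical bar

* the same, with the data markers suppressed
stripplot mpg, over(rep78) vertical bar ms(none)
```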

-eclplot- (SSC also SJ, Roger Newson) is another user-written command,
and one characteristically well thought out, documented and
maintained. It's not competing because it is focused on a different
case, in which you already have estimates and confidence limits to
hand; other programs of Roger's are of much help in assembling and
analysing such results.
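
A sketch of that case, assuming -eclplot- has been installed from SSC
and that it takes, in order, the estimate, the lower limit, the upper
limit and the variable to plot against. The estimates themselves are
manufactured here with -collapse- purely for illustration:

```stata
* ssc install eclplot    // uncomment if not yet installed
sysuse auto, clear
drop if missing(rep78)

* prepare estimates and t-based 95% limits, one observation per group
collapse (mean) mean=mpg (sd) sd=mpg (count) n=mpg, by(rep78)
gen se = sd / sqrt(n)
gen lb = mean - invttail(n - 1, 0.025) * se
gen ub = mean + invttail(n - 1, 0.025) * se

* estimate, lower limit, upper limit, plotting variable
eclplot mean lb ub rep78
```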

I want to flag strongly the scope for using -statsby- in this
territory, which I wrote up in

SJ-10-1 gr0045  . . . . . . . . . . . . . Speaking Stata: The statsby strategy
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q1/10   SJ 10(1):143--151                                (no commands)
        demonstrates the use of statsby to prepare a reduced
        dataset for subsequent graphing

.pdf freely available at
http://www.stata-journal.com/sjpdf.html?articlenum=gr0045

Confidence intervals are a major example. (That paper was inspired by
a single throw-away remark by Vince Wiggins. It was one of many
occasions in which deciding to write about something made me aware of
something in Stata I was underestimating.)
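
A minimal sketch of the idea, assuming the auto dataset. Note that the
syntax of -ci- depends on your version of Stata: -ci means mpg- in
Stata 14 and later, -ci mpg- in Stata 13 and earlier.

```stata
* Sketch: let -statsby- collect the results of -ci- for each group.
* Run from a do-file, because /// continues a line.
sysuse auto, clear
drop if missing(rep78)

* the dataset in memory is replaced by one observation per group
statsby mean=r(mean) lb=r(lb) ub=r(ub), by(rep78) clear: ci means mpg

twoway rcap lb ub rep78 || scatter mean rep78, ///
    ytitle("Mean mileage (mpg)") legend(off)
```

The same pattern applies to any command that leaves an estimate and
its limits in r() or e() results.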

I would also like to mention a general discussion of graphical technique in

SJ-8-2  gr0034  . . . . . . . . . .  Speaking Stata: Between tables and graphs
        (help labmask, seqvar if installed) . . . . . . . . . . . .  N. J. Cox
        Q2/08   SJ 8(2):269--289
        outlines techniques for producing table-like graphs

.pdf freely available at
http://www.stata-journal.com/sjpdf.html?articlenum=gr0034

Nick
njcoxstata@gmail.com