Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Stripplot: problem with axis for variable with only 2 observations

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Stripplot: problem with axis for variable with only 2 observations
Date: Thu, 25 Oct 2012 00:32:39 +0100

```
On a little further thought, there is a simpler explanation.

Executive summary: Very wide confidence intervals are only to be
expected for samples of size 2.

Meghan is asking for a display of the confidence interval for the mean
based on a subsample of two observations. By default with -stripplot-
that is a t-based confidence interval, as calculated by -ci- by default.
With a sample size that small, and provided that the two values are
distinct, the confidence interval will necessarily be very wide. (If
they aren't distinct, the calculation is impossible.)

An exactly pertinent example is to hand with the auto data.

. sysuse auto

. bysort rep78 : ci mpg

-----------------------------------------------------------------------------------------------------------------------
-> rep78 = 1

    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
         mpg |          2          21           3       -17.11861    59.11861

[output suppressed]

We can replicate that calculation. After -su mpg if rep78 == 1, d- the
standard deviation is accessible as r(sd) and we can calculate

. di 21 + invttail(1, 0.025) * r(sd)/sqrt(2)
59.118614

. di 21 - invttail(1, 0.025) * r(sd)/sqrt(2)
-17.118614

Two things conspire to make that interval wide: the standard error is
similar in size to the standard deviation (division by the square root
of the sample size does little when that is only sqrt(2)), and the
t multiplier is large (invttail(1, 0.025) is about 12.7).

The example is doubly pertinent as it is plotted as the 7th graph on
the page cited by Meghan.

On Thu, Oct 25, 2012 at 12:04 AM, Nick Cox <njcoxstata@gmail.com> wrote:

> There are various problems here.
>
> 1. The program referred to, -stripplot-, is a user-written program
> from SSC; posters are asked to explain that.
>
> 2. Meghan refers to attachments, but attachments should not be sent to
> Statalist; in any case none are visible.
>
> 3. Meghan refers to the 9th graph down, but she is trying to emulate
> it with data we can't see and a command she does not give.
>
> I'll try to resolve this directly with Meghan.
>
> Nick
>
> On Wed, Oct 24, 2012 at 10:09 PM, Meghan Doiron, Miss
> <meghan.doiron@mail.mcgill.ca> wrote:
>
>> I am using a stripplot to compare community perceptions of prices versus actual market prices (rematista) and see how well the individual observations match up on different weeks. Stripplot seems to be the best option given that I have two categorical variables (week and community versus market) and one continuous variable: price. I am basically trying to recreate one of the stripplot graphs from http://www.survey-design.com.au/Usergraphs.html (9th graph down) with my data. When I plot this using the "bar" option I get strangely large values on the x-axis, and the legend ranges from -100 to +100 even though my price variable only ranges between 5 and 20. By contrast, if I use the "box" option instead of "bar" this doesn't happen. I tried changing the axis but the Stata command just ignores my request to make the legend smaller. Attached are the graphs for reference.

```
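
The replication in the message above can be collected into a short do-file fragment. This is only a sketch, assuming the auto data shipped with Stata; the local macro names (`n`, `se`, `t`) are illustrative and do not appear in the original post. It recomputes the t-based 95% interval for mpg when rep78 == 1, then shows how the t multiplier shrinks as the sample size grows.

```stata
* Sketch only: reproduce the t-based 95% CI that -ci- reports for rep78 == 1.
* Local macro names below are illustrative, not from the original post.
sysuse auto, clear
quietly summarize mpg if rep78 == 1

local n  = r(N)                      // 2 observations in this subsample
local se = r(sd) / sqrt(`n')         // standard error of the mean
local t  = invttail(`n' - 1, 0.025)  // two-sided 95% multiplier, n - 1 df

display "lower limit = " r(mean) - `t' * `se'
display "upper limit = " r(mean) + `t' * `se'

* How quickly the multiplier falls as the sample size increases
foreach k of numlist 2 3 5 10 30 {
    display "n = `k'   t multiplier = " %6.3f invttail(`k' - 1, 0.025)
}
```

The two limits should agree with the -ci- output quoted in the message. The loop is there only to make the point about sample size concrete: with one degree of freedom the multiplier is far larger than it is for even modestly bigger samples.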