Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: Stripplot: problem with axis for variable with only 2 observations

 From "Meghan Doiron, Miss" To "statalist@hsphsun2.harvard.edu" Subject RE: st: Stripplot: problem with axis for variable with only 2 observations Date Thu, 25 Oct 2012 16:35:38 +0000

```
Dear Statalist,

I am a first-time poster and a relatively new Stata user. I would like to apologize for the errors in my original post.

Indeed, I was misinterpreting the use of a bar plot and a box plot. In fact, what I was trying to show was something like a box-and-whisker plot (with mean, median and range) superimposed on my points.

Thank you to Nick Cox for his help with the proper syntax and for the clarification.

In the interest of following the Statalist guidelines, here is my original syntax, followed by the solution:

Original syntax:

stripplot price_yuca* if week_yuca!=6 & week_yuca!=11 & week_yuca!=12, stack bar over(who_yucalw) by(week_yuca, compact col(1) note("")) ysc(reverse) subtitle(, pos(9) ring(1) nobexpand bcolor(none) placement(e)) ytitle("") xtitle(Price yuca (soles))

Updated syntax (with mean, median and range added, and the "bar" option removed):

egen mean=mean(price_yuca), by(who_yucalw week_yuca)
egen median=median(price_yuca), by(who_yucalw week_yuca)
egen min=min(price_yuca), by(who_yucalw week_yuca)
egen max=max(price_yuca), by(who_yucalw week_yuca)

stripplot price_yuca* if week_yuca!=6 & week_yuca!=11 & week_yuca!=12, stack over(who_yucalw) by(week_yuca, compact col(1) note("")) subtitle(, pos(9) ring(1) nobexpand bcolor(none) placement(e)) ytitle("") xtitle(Price yuca (soles)) height(0.5) msize(*.8) addplot(scatter who_yuca2 mean, mcolor(orange) ms(Dh) msize(*1.2) || scatter who_yuca2 median, mcolor(blue) ms(+) || rcap min max who_yuca2, horizontal)

----------------------------------
Meghan Doiron
MSc Candidate Geography
McGill University

Tel: +1 (514) 589-3686
Email: meghan.doiron@mail.mcgill.ca

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] on behalf of Nick Cox [njcoxstata@gmail.com]
Sent: Wednesday, October 24, 2012 7:32 PM
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Stripplot: problem with axis for variable with only 2 observations

On a little further thought, there is a simpler explanation.

Executive summary: Very wide confidence intervals are only to be
expected for samples of size 2.

Meghan is asking for a display of the confidence interval for the mean
based on a subsample of two observations. With -stripplot-, that is the
t-based confidence interval that -ci- produces by default.
With a sample size that small, and provided that the two values are
distinct, the confidence interval will necessarily be very wide. (If
they aren't distinct, the calculation is impossible.)

An exactly pertinent example is to hand with the auto data.

. sysuse auto

. bysort rep78 : ci mpg

-----------------------------------------------------------------------------------------------------------------------
-> rep78 = 1

    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
         mpg |          2          21           3       -17.11861    59.11861

[output suppressed]

We can replicate that calculation. After -su mpg if rep78 == 1, d- the
standard deviation is accessible as r(sd) and we can calculate

. di 21 + invttail(1, 0.025) * r(sd)/sqrt(2)
59.118614

. di 21 - invttail(1, 0.025) * r(sd)/sqrt(2)
-17.118614

Two things conspire to make that interval wide: the standard error is
similar to the standard deviation (division by the square root of the
sample size is not doing much when that is sqrt(2), about 1.41), and
the t multiplier is large (invttail(1, 0.025) is about 12.71).

The example is doubly pertinent as it is plotted as the 7th graph on
the page cited by Meghan.

On Thu, Oct 25, 2012 at 12:04 AM, Nick Cox <njcoxstata@gmail.com> wrote:

> There are various problems here.
>
> 1. The program referred to, -stripplot-, is a user-written program
> from SSC; posters are asked to explain that.
>
> 2. Meghan refers to attachments, but attachments should not be sent to
> Statalist; in any case none are visible.
>
> 3. Meghan refers to the 9th graph down, but she is trying to emulate
> it with data we can't see and a command she does not give.
>
> I'll try to resolve this directly with Meghan.
>
> Nick
>
> On Wed, Oct 24, 2012 at 10:09 PM, Meghan Doiron, Miss
> <meghan.doiron@mail.mcgill.ca> wrote:
>
>> I am using a stripplot to compare community perceptions of prices versus actual market prices (rematista) and see how well the individual observations match up on different weeks. Stripplot seems to be the best option given that I have two categorical variables (week and community versus market) and one continuous variable: price. I am basically trying to recreate one of the stripplot graphs from http://www.survey-design.com.au/Usergraphs.html (9th graph down) with my data. When I plot this using the "bar" option I get strangely large values on the x-axis, and the legend ranges from -100 to +100 even though my price variable only ranges between 5 and 20. By contrast, if I use the "box" option instead of "bar" this doesn't happen. I tried changing the axis but the Stata command just ignores my request to make the legend smaller. Attached are the graphs for reference.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

```
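For readers who want to check Nick Cox's hand calculation outside Stata, the interval for rep78 == 1 can be reproduced with a short Python sketch. It is only a cross-check, not part of the original thread. The function name `t_ci_two_obs` is hypothetical, and the inputs (mean 21, standard error 3) are taken from the -ci- output quoted above. The sketch relies on the fact that a t distribution with 1 degree of freedom is a Cauchy distribution, so its quantile has the closed form tan(pi * (p - 0.5)); it therefore applies only to samples of size 2.

```python
import math


def t_ci_two_obs(mean, std_err, level=0.95):
    """t-based confidence interval for a mean from n = 2 observations.

    With n = 2 there is 1 degree of freedom, and the t distribution with
    1 df is the Cauchy distribution, whose quantile function is
    tan(pi * (p - 0.5)). This helper is hypothetical and for illustration.
    """
    p = 1 - (1 - level) / 2                  # upper-tail quantile, e.g. 0.975
    t_mult = math.tan(math.pi * (p - 0.5))   # plays the role of invttail(1, 0.025)
    half_width = t_mult * std_err
    return mean - half_width, mean + half_width, t_mult


# Mean and standard error as reported by -ci- for rep78 == 1 in the thread
lower, upper, t_mult = t_ci_two_obs(21, 3)
print(f"t multiplier: {t_mult:.4f}")
print(f"95% CI: [{lower:.5f}, {upper:.5f}]")
```

The multiplier comes out near 12.71, against roughly 1.96 for a large sample, which is why the limits of about -17.12 and 59.12 stretch the axis so far beyond the data.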