Statalist The Stata Listserver


st: RE: RE: adding mean plot to anovaplot


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: RE: adding mean plot to anovaplot
Date   Tue, 2 May 2006 11:19:24 +0100

I was dopey on this. As David Airey pointed out 
privately, it is pretty clear that John wants _observed_ means 
plotted as well as fitted means. 

This can be done quite(*) easily with a little preparation. 

Consider 

. sysuse auto, clear 

. anova mpg rep78 foreign 

                           Number of obs =      69     R-squared     =  0.2825
                           Root MSE      = 5.16246     Adj R-squared =  0.2256

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  661.189524     5  132.237905       4.96     0.0007
                         |
                   rep78 |  179.189006     4  44.7972516       1.68     0.1655
                 foreign |  111.773747     1  111.773747       4.19     0.0447
                         |
                Residual |  1679.01337    63  26.6510059   
              -----------+----------------------------------------------------
                   Total |   2340.2029    68  34.4147485   

. egen mean = mean(mpg) if e(sample) , by(rep78) 
(5 missing values generated)

Crucial detail: the -if e(sample)- can be important when 
there are missing values, to ensure that you get comparable results. 
It does no harm even if there aren't. This command must follow
the -anova-; otherwise e(sample) either isn't defined
or may be an inappropriate e(sample) left over from another 
model. 
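
A minimal sketch of what the restriction buys you, using the same 
auto data. The variable names mean_es and mean_all are made up 
for illustration only. 

```stata
sysuse auto, clear
anova mpg rep78 foreign

* means over the estimation sample only
egen mean_es = mean(mpg) if e(sample), by(rep78)

* means over every observation, including any that -anova- dropped
egen mean_all = mean(mpg), by(rep78)

* how many observations did -anova- not use?
count if !e(sample)

* show the observations where the two versions disagree
list rep78 mpg mean_es mean_all if mean_es != mean_all
```

Here the observations left out should be those with rep78 missing, 
so the listing is a quick check on which cases the model ignored. 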

If you want different means, the handle is the -by()- 
option. 
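
For example, a sketch of two other choices, again after the same 
-anova-. The names cellmean and formean are hypothetical. 

```stata
sysuse auto, clear
anova mpg rep78 foreign

* observed means within each rep78-by-foreign cell
egen cellmean = mean(mpg) if e(sample), by(rep78 foreign)

* observed means for each level of foreign, collapsed across rep78
egen formean = mean(mpg) if e(sample), by(foreign)

* one row per cell, to inspect what was calculated
tabdisp rep78 foreign if e(sample), cell(cellmean)
```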

. anovaplot , plot(scatter mean rep78 , ms(Dh)) legend(order(2 "Domestic" 3 "Foreign" 4 "observed means")) 
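
In a do-file the same steps could be sketched as follows, with 
/// to continue the long command. This assumes that -anovaplot- 
is installed and that the legend keys are numbered as in the 
command above, which depends on what -anovaplot- draws. 

```stata
sysuse auto, clear
anova mpg rep78 foreign

* observed means by rep78, restricted to the estimation sample
egen mean = mean(mpg) if e(sample), by(rep78)

* superimpose the observed means on the fitted means and data
anovaplot, plot(scatter mean rep78, ms(Dh)) ///
    legend(order(2 "Domestic" 3 "Foreign" 4 "observed means"))
```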

In short, the trick is to calculate the means separately
and then use -anovaplot-'s -plot()- option to show
them superimposed. If you think that's clever, it is, 
and all your applause should be directed at StataCorp 
for inventing it. Stata 9 users should note that 
the last version of -anovaplot- I know about was 
for Stata 8, so -addplot()-, the same thing but
with a Stata 9 name, does not yet work. 

The tricky bit is getting the legend right. This 
issue arose recently in a discussion of the user-
written program -glcurve- by P. van Kerm and S. Jenkins. 
The best tip is to view the source to see what -anovaplot-
is doing under the {hood | bonnet}. 
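
One way to get at the source, sketched on the assumption that 
-anovaplot- is installed and that your Stata is recent enough to 
have -viewsource-: 

```stata
* report where anovaplot.ado lives on the ado-path
which anovaplot

* open the ado-file in the Viewer to see how the graph is built
viewsource anovaplot.ado
```

Looking for the -twoway- call there should show the order in which 
the plots are drawn, which is what fixes the legend numbering. 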

(*) "quite" in English, meaning British, meaning
"moderately". "quite" in American appears to 
mean "extremely". Thus an American speaker at a London 
Stata users' meeting who thanked a questioner for 
a "quite helpful" comment got some quite puzzled 
looks from the audience. 

Nick 
n.j.cox@durham.ac.uk 

Nick Cox
 
> For those not in the know, -anovaplot- is 
> a user-written command. 
> 
> . search anovaplot 
> 
> points to a write-up: 
> 
> SJ-4-4  gr0009  . . . . . . . . . . Speaking Stata: Graphing model diagnostics
>         (help anovaplot, indexplot, modeldiag, ofrtplot, ovfplot,
>         qfrplot, racplot, rdplot, regplot, rhetplot, rvfplot2,
>         rvlrplot, rvpplot2 if installed)
>         Q4/04   SJ 4(4):449--475
>         plotting diagnostic information calculated from residuals
>         and fitted values from regression models with continuous
>         responses
> 
> Now on the question, I'm not clear what John wants 
> that -anovaplot- does not provide, 
> as the main purpose of -anovaplot- is precisely to 
> show means according to anova factors. 
> 
> Thus 
> 
> . sysuse auto, clear
> 
> . anova mpg rep78 foreign
> 
>                            Number of obs =      69     R-squared     =  0.2825
>                            Root MSE      = 5.16246     Adj R-squared =  0.2256
> 
>                   Source |  Partial SS    df       MS           F     Prob > F
>               -----------+----------------------------------------------------
>                    Model |  661.189524     5  132.237905       4.96     0.0007
>                          |
>                    rep78 |  179.189006     4  44.7972516       1.68     0.1655
>                  foreign |  111.773747     1  111.773747       4.19     0.0447
>                          |
>                 Residual |  1679.01337    63  26.6510059   
>               -----------+----------------------------------------------------
>                    Total |   2340.2029    68  34.4147485   
> 
> . anovaplot
> 
> gives me two parallel segmented lines showing the fitted means 
> as a function of the factors, plus point symbols for the 
> data. 
> 
> (For some unknown reason, ANOVA people tend to plot just 
> means, and not the original data, but the author of -anovaplot-
> evidently does not approve. Any regression person
> showing just a straight line would get told pretty promptly
> to add the data by any competent referee or boss.) 
> 
> Nick 
> n.j.cox@durham.ac.uk 
> 
> John Novak
>  
> > I would like to add a plot of the treatment means collapsed across the by
> > variable to an -anovaplot-.  I have done this:
> > 
> > #delimit ;
> > quietly anova y a b a*b;
> > anovaplot ,
> >          scatter(msymbol(i) xsize(3) ysize(3) name(by_b, replace))
> >          plot(mband y a) ;
> > #delimit cr
> > 
> > It is almost what I want, but adds the median band instead of a mean
> > band.  Does anyone know how I can accomplish the same effect, but with
> > means instead of medians?
