


Re: st: Query..

From   David Hoaglin
Subject   Re: st: Query..
Date   Tue, 16 Apr 2013 18:43:48 -0400


That's a reasonable request.  I'm not able to give an extensive list
at the moment, but I can give a few examples.  I have the third

Let me repeat that I have found the book very helpful in learning to use Stata.

At the end of Section 5.6 (page 112) the discussion of the box plot
says, "Lines extend from the edge of the dark-gray box 1.5 box
lengths, or until they reach the largest or smallest cases.  Beyond
this, there are some dots representing outliers or extreme values." In
fact, the lines extend only as far as the most extreme observations
that are not more than 1.5 box lengths from the edge of the box.
Observations more extreme than that are plotted individually, but they
are not necessarily outliers.  In exploratory data analysis, such
observations are called "outside."  They deserve to be investigated,
and they might then be determined to be outliers.
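
The whisker rule described above can be sketched in a few lines of
Python.  This is only an illustration: the function name
boxplot_summary and the sample data are made up, and the quartiles come
from Python's statistics.quantiles, which need not match the quartile
definition that Stata uses.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Whisker ends and outside values under the 1.5-box-length rule."""
    data = sorted(values)
    # Quartiles (one of several definitions; an assumption of this sketch).
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1
    lower_fence = q1 - 1.5 * box_length
    upper_fence = q3 + 1.5 * box_length
    # Whiskers stop at the most extreme observations inside the fences,
    # not at the fences themselves.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Observations beyond the fences are "outside" and plotted individually.
    outside = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outside": outside,
    }


# Hypothetical data with one value far from the rest.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

With these made-up data the upper fence is at 14.5, but the upper
whisker stops at 9, the largest observation inside the fence.  The
value 30 is reported as outside; whether it is an outlier is a separate
question that requires looking at the data.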

In Section 10.3 (page 248) the interpretation of unstandardized
regression coefficients is oversimplified and unsatisfactory: "For
example, a 1-unit change [increase] in com3, identification with the
community, produces a 0.05-unit change in env.con, environmental
concern, holding all other variables constant."  The problem is the
phrase "holding all other variables constant," which does not reflect
the way multiple regression works.  I realize that other books give
the same interpretation, but that does not make it satisfactory.  The
students who learn about multiple regression from those books are
being done a disservice.
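
The post does not spell out an alternative wording.  One description
that is often offered (an assumption here, not something stated above)
is that the coefficient is the slope of the response on that predictor
after both have been adjusted for the other predictors.  The sketch
below checks that identity numerically for two predictors.  All names
and data are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of u and v about their means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def residuals(y, x):
    """Residuals from a simple least-squares regression of y on x."""
    slope = cross(x, y) / cross(x, x)
    intercept = mean(y) - slope * mean(x)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def multiple_slope_x1(y, x1, x2):
    """Coefficient of x1 in the least-squares regression of y on x1 and x2."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)


# Hypothetical data in which x1 and x2 are correlated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.0]

b1 = multiple_slope_x1(y, x1, x2)

# Adjust both y and x1 for x2, then regress residuals on residuals.
x1_adjusted = residuals(x1, x2)
y_adjusted = residuals(y, x2)
b1_from_adjusted = cross(x1_adjusted, y_adjusted) / cross(x1_adjusted, x1_adjusted)

print(b1, b1_from_adjusted)
```

The two numbers agree up to rounding.  Nothing in the calculation
holds x2 fixed; the data simply contain whatever joint variation of x1
and x2 happened to be observed.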

The title of Section 10.5 asks "Is the dependent variable normally
distributed?"  This is not a useful question.  The component of the
regression that one would like to have follow a normal distribution is
the fluctuations or "errors," whose behavior is examined via the
residuals (discussed in Section 10.6).  I was delighted to see the
hanging rootogram in Figure 10.4 (page 252).  An important reason for
using a hanging rootogram (or histogram) is that it brings the
departures from the fitted normal density together along the
horizontal line at 0.  A suspended rootogram would be better, because
then positive deviations would extend up from that line and negative
deviations would extend down.
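
As a rough sketch of the difference between the two displays, the
Python below computes, for each bin, where the bottom of a hanging bar
falls and the corresponding suspended deviation.  The data, the bin
edges, and the function name are hypothetical, and the normal curve is
fitted simply from the sample mean and standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def rootogram_deviations(data, edges):
    """Per-bin hanging-bar bottoms and suspended deviations (root scale)."""
    fitted = NormalDist.from_samples(data)  # mean and SD from the sample
    n = len(data)
    rows = []
    for lower, upper in zip(edges, edges[1:]):
        observed = sum(lower <= x < upper for x in data)
        expected = n * (fitted.cdf(upper) - fitted.cdf(lower))
        # Hanging: the bar hangs from sqrt(expected) and has length
        # sqrt(observed), so its bottom lands here relative to the line at 0.
        hanging_bottom = sqrt(expected) - sqrt(observed)
        # Suspended: the deviation is plotted from 0, positive upward.
        suspended = sqrt(observed) - sqrt(expected)
        rows.append((lower, upper, observed, expected, hanging_bottom, suspended))
    return rows


# Hypothetical, mildly right-skewed data.
data = [2.1, 2.9, 3.4, 3.8, 4.0, 4.3, 4.5, 4.7, 4.9, 5.0,
        5.1, 5.3, 5.6, 5.8, 6.1, 6.4, 6.9, 7.5, 8.8, 10.2]
edges = [2, 4, 6, 8, 10, 12]

rows = rootogram_deviations(data, edges)
for lower, upper, observed, expected, hanging_bottom, suspended in rows:
    print(f"[{lower}, {upper}): obs={observed} exp={expected:.2f} "
          f"hanging bottom={hanging_bottom:+.2f} suspended={suspended:+.2f}")
```

The two columns differ only in sign.  In the hanging version a bin
with more observations than the fitted curve predicts dips below the
line at 0; in the suspended version the same excess points upward.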

On page 253, the discussion of values of kurtosis that depart from
that for the normal distribution (3.00) is reversed: "A value of less
than 3.00 means that the tails are too thick (hence, too flat in the
middle), and a value of greater than 3.00 means that the tails are too
thin (hence, too peaked in the middle)."  In fact, heavier-than-normal
tails correspond to kurtosis > 3, and lighter-than-normal tails
correspond to kurtosis < 3.
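
A small numerical check of the direction, using the fourth-moment
definition of kurtosis (normal = 3).  The three "samples" are just
evenly spaced quantiles of a uniform, a normal, and a Laplace (double
exponential) distribution; the helper names are made up for this
sketch.

```python
from math import log
from statistics import NormalDist


def kurtosis(values):
    """Fourth central moment divided by the squared second central moment."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n
    m4 = sum((x - center) ** 4 for x in values) / n
    return m4 / m2 ** 2


def laplace_quantile(p):
    """Inverse CDF of the standard Laplace distribution (heavy tails)."""
    return log(2 * p) if p < 0.5 else -log(2 * (1 - p))


# Evenly spaced probabilities strictly between 0 and 1.
probs = [(i + 0.5) / 1000 for i in range(1000)]

light_tailed = probs                                   # uniform quantiles
normal_like = [NormalDist().inv_cdf(p) for p in probs]
heavy_tailed = [laplace_quantile(p) for p in probs]

print("uniform:", kurtosis(light_tailed))
print("normal: ", kurtosis(normal_like))
print("Laplace:", kurtosis(heavy_tailed))
```

The light-tailed uniform values give a kurtosis below 3, the normal
quantiles give a value close to 3, and the heavy-tailed Laplace values
give a kurtosis above 3.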

The discussion on page 254 confuses robust regression with "robust"
standard errors.  The -regress- command continues to use ordinary
least squares.
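
To make the distinction concrete, here is a minimal Python sketch for
a simple regression.  The slope is computed once, by ordinary least
squares; only the standard error is computed in two ways.  The
sandwich formula uses the n/(n-k) scaling often called HC1; that it
corresponds to what any particular package reports is an assumption of
this sketch, and the data are invented.

```python
from math import sqrt


def ols_slope_with_two_ses(x, y):
    """OLS slope with a classical and a heteroskedasticity-robust SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Ordinary least squares: the same estimate for either standard error.
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical SE: one common error variance for all observations.
    classical_var = sum(e ** 2 for e in resid) / (n - 2) / sxx
    # Sandwich SE: each squared residual enters separately (k = 2 here).
    robust_var = (n / (n - 2)) * sum(
        ((xi - x_bar) * e) ** 2 for xi, e in zip(x, resid)
    ) / sxx ** 2
    return slope, sqrt(classical_var), sqrt(robust_var)


# Hypothetical data with one unusual response at the largest x.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 12.0]

slope, classical_se, robust_se = ols_slope_with_two_ses(x, y)
print(slope, classical_se, robust_se)
```

The unusual last observation pulls the least-squares slope in both
cases, because the fitting criterion has not changed.  A robust
regression method would instead change the criterion and therefore the
coefficients themselves.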

David Hoaglin

On Tue, Apr 16, 2013 at 11:45 AM, Lachenbruch, Peter wrote:
> David -
> It would be good for you to specify what you find problematic with Acock's book.  I've used it and not had any problems - but maybe I'm just ancient and not seeing issues.
> Peter A. Lachenbruch,
> Professor (retired)