st: yet another update of hangroot available

From	Maarten buis <[email protected]>
To	stata list <[email protected]>
Subject	st: yet another update of hangroot available
Date	Mon, 23 Nov 2009 07:21:54 -0800 (PST)

Thanks to Kit Baum, a new version of the -hangroot- package is
available from SSC. This update introduces the option to
display a suspended rootogram rather than a hanging rootogram.
The package can be installed by typing -ssc install hangroot-
in Stata, or updated by typing -adoupdate, update- or
-ssc install hangroot, replace-.

Both graphs are designed to compare an empirical distribution
graphically with a theoretical distribution. The idea behind
these graphs is best explained by showing example graphs,
which can be seen here:
<http://www.maartenbuis.nl/software/hangroot.html>. However,
for completeness' sake, here is a verbal description of these
graphs:

The hanging rootogram draws the theoretical distribution and
"hangs" the histogram bars representing the empirical
distribution from it, rather than "standing" these bars on the
x-axis. This way, deviations from the theoretical distribution
are visible as deviations from the horizontal line y=0, which
makes it easier to spot patterns in these deviations.
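
For readers who want to see the arithmetic behind the hanging
bars, here is a minimal sketch in Python. It is not the code
of -hangroot-; the function name, the toy data, and the choice
of a standard normal as theoretical distribution are all
assumptions made for illustration. It works on the square root
scale that both graphs use.

```python
import math
from statistics import NormalDist

def hanging_rootogram(data, edges, dist):
    """Return (top, bottom) of each hanging bar on the square root scale.

    Hypothetical helper for illustration only, not part of -hangroot-.
    """
    n = len(data)
    bars = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo <= x < hi for x in data)       # count in this bin
        expected = n * (dist.cdf(hi) - dist.cdf(lo))     # theoretical count
        top = math.sqrt(expected)           # bar hangs from the theoretical curve
        bottom = top - math.sqrt(observed)  # where the bar ends relative to y=0
        bars.append((top, bottom))
    return bars

# Toy data and bins, made up for this example
data = [-1.2, -0.4, -0.3, 0.1, 0.2, 0.3, 0.9, 1.5]
edges = [-2, -1, 0, 1, 2]
bars = hanging_rootogram(data, edges, NormalDist(0, 1))
for top, bottom in bars:
    print(f"top={top:.3f}  bottom={bottom:.3f}")
```

A bar whose bottom falls below y=0 belongs to a bin with more
observations than the theoretical distribution predicts; a bar
that stops above y=0 belongs to a bin with fewer.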

The suspended rootogram takes this graph one step further. It
recognizes that the key information in the hanging rootogram
is not the histogram bars themselves but their deviations from
the line y=0, so why not display these residuals directly? It
then makes sense to flip the entire graph upside down,
"suspending" the theoretical distribution from the x-axis, so
that positive residuals represent too many observations in a
bin and negative residuals represent too few. We can optionally
suppress the display of the theoretical distribution, focussing
entirely on the residuals.
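
The residuals shown in a suspended rootogram can be sketched
in a few lines of Python. Again, this is only an illustration
under assumptions (hypothetical function name, toy data,
standard normal as theoretical distribution), not the code of
-hangroot-.

```python
import math
from statistics import NormalDist

def suspended_residuals(data, edges, dist):
    """Residuals sqrt(observed) - sqrt(expected), one per bin.

    Hypothetical helper for illustration only, not part of -hangroot-.
    """
    n = len(data)
    residuals = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(lo <= x < hi for x in data)       # count in this bin
        expected = n * (dist.cdf(hi) - dist.cdf(lo))     # theoretical count
        # positive: too many observations; negative: too few
        residuals.append(math.sqrt(observed) - math.sqrt(expected))
    return residuals

# Toy data and bins, made up for this example
data = [-1.2, -0.4, -0.3, 0.1, 0.2, 0.3, 0.9, 1.5]
edges = [-2, -1, 0, 1, 2]
residuals = suspended_residuals(data, edges, NormalDist(0, 1))
for (lo, hi), r in zip(zip(edges[:-1], edges[1:]), residuals):
    print(f"[{lo}, {hi}): residual={r:+.3f}")
```

These residuals are the bottoms of the hanging bars with the
sign reversed, which is the "flip" described above.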

Another characteristic of both the hanging rootogram and the
suspended rootogram is that they show the frequencies on the
square root scale. This stabilizes the sampling variation of
the lengths of the bars representing the empirical
distribution. These lengths are counts of the number of
observations that fall within each bin, and larger counts tend
to have larger sampling variation than smaller counts, making
it harder to compare the deviations across bins. After taking
the square root, the sampling variation tends to be
approximately equal across bins, facilitating the comparison
across bins. Moreover, this tends to make deviations in the
tails, where the counts are small, more visible.
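
The stabilizing effect of the square root can be checked
numerically. The sketch below assumes, for illustration only,
that the count in a single bin is binomial with 500
observations and a bin probability of either 0.01 or 0.10. It
computes the exact standard deviation of the count and of its
square root; the function name is hypothetical.

```python
import math

def sd_raw_and_root(n, p):
    """Exact standard deviations of X and sqrt(X) for X ~ Binomial(n, p)."""
    pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

    def sd(f):
        mean = sum(f(k) * w for k, w in enumerate(pmf))
        var = sum((f(k) - mean) ** 2 * w for k, w in enumerate(pmf))
        return math.sqrt(var)

    return sd(float), sd(math.sqrt)

# Assumed setting: 500 observations, a sparse bin and a well-filled bin
for p in (0.01, 0.10):
    raw, root = sd_raw_and_root(500, p)
    print(f"p={p:.2f}  sd(count)={raw:.3f}  sd(sqrt(count))={root:.3f}")
```

Under these assumptions the standard deviation of the raw
count is about three times larger in the well-filled bin than
in the sparse one, while on the square root scale the two
standard deviations are of similar size.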

-- Maarten

--------------------------
Maarten L. Buis
Institut fuer Soziologie
Universitaet Tuebingen
Wilhelmstrasse 36
72074 Tuebingen
Germany

http://www.maartenbuis.nl
--------------------------
