Statalist



st: RE: Histogram, by(var, total)


From   "Thoma, Marie E." <mthoma@jhsph.edu>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Histogram, by(var, total)
Date   Mon, 8 Jun 2009 12:56:04 -0400

Thank you, Nick and Martin.  I will try both suggestions and see what happens!
(I hope this email goes through this time; I had problems previously.)

Marie

________________________________________
From: owner-statalist@hsphsun2.harvard.edu [owner-statalist@hsphsun2.harvard.edu] On Behalf Of Nick Cox [n.j.cox@durham.ac.uk]
Sent: Sunday, June 07, 2009 1:14 PM
To: statalist@hsphsun2.harvard.edu
Subject: st: RE: Histogram, by(var, total)

I don't know a direct way to do this, but some trickery produces the
same result. It is best explained by example.

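* load the example data; -preserve- lets us undo the changes below
sysuse auto, clear
preserve
* the copies added by -expand- will start at observation _N + 1
local which = _N + 1
expand 2
* recode the copies to -1, which sorts before the existing values 0 and 1
replace foreign = -1 in `which'/L
* origin is the value label attached to foreign
label def origin -1 "Total", add
histogram mpg, by(foreign)
restore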

Key points:

1. -expand 2- doubles the dataset. The second half, a copy of the first,
supplies the observations for the "Total" category.

2. If the -by()- variable is integer with value labels, the extra
observations should be assigned an integer value of the -by()- variable
that is _lower_ than any other observed. You then need to define an
appropriate value label for it. (In this case, I know that the other
values are 0 and 1.)

3. You do _not_ then specify the -total- suboption, as you are using
your own subterfuge to replicate it.

4. -preserve- and -restore- are optional, but without them this is a
major change to the dataset in memory.

Note that Stata in no sense "knows" that the extra category is a total
category, but that shouldn't matter.
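
Where the existing values or the name of the value label are not known
in advance, both can be looked up rather than typed in. This is only a
sketch, assuming that the -by()- variable is integer-valued and already
has a value label attached; the local macro names total and lbl are made
up for the illustration.

```stata
sysuse auto, clear
preserve
local which = _N + 1
* smallest observed value of the by-variable, minus 1
summarize foreign, meanonly
local total = r(min) - 1
* name of the value label attached to foreign (origin in this dataset)
local lbl : value label foreign
expand 2
replace foreign = `total' in `which'/L
label define `lbl' `total' "Total", add
histogram mpg, by(foreign)
restore
```

If the variable has no value label attached, the macro lbl will be empty
and the -label define- line will fail, so that case needs a label to be
defined and attached first.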

Now what would be done if the -by()- variable were string? At first
sight, we have a problem here because "Total" would not necessarily sort
first in a set of alphanumeric categories. We could use some label like
"All observations" but what then if we have "Aardvarks" as a category?

Here is a better trick (not to rule out the possibility of an even
better trick):

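sysuse auto, clear
preserve
* create a string variable for the sake of the example
decode foreign, gen(Foreign)
local which = _N + 1
expand 2
* leading space sorts " Total " first; trailing space keeps the title centred
replace Foreign = " Total " in `which'/L
histogram mpg, by(Foreign)
restore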

The -decode- is just to produce an example with an appropriate string
variable. In practice such a variable will exist already. Notice the two
small parts to the trick:

(a) Putting a space before the "Total" makes it more likely to sort to
the beginning of any set of categories. The space " " is a character
too, and it sorts before all letters and digits.

(b) Putting a space afterwards ensures that the "Total" is still centred
on the graph (if you care about that).

But we need not worry too much about the string case. If you can't get
the order you want, map the strings to integers with value labels.
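
One possible way to do that mapping is with -encode-, sketched here
under the assumption that alphabetical order of the existing categories
is acceptable. The variable name group is invented for the example.
-encode- assigns the codes 1, 2, ... and, unless told otherwise, stores
the labels under the same name as the new variable, so 0 is free for the
total.

```stata
sysuse auto, clear
preserve
* create a string variable for the sake of the example
decode foreign, gen(Foreign)
* map the strings to integers 1, 2, ... with value label group
encode Foreign, gen(group)
local which = _N + 1
expand 2
* 0 is lower than any code that -encode- assigns
replace group = 0 in `which'/L
label define group 0 "Total", add
histogram mpg, by(group)
restore
```

If some other order is wanted, the value label can be defined by hand
beforehand and passed to -encode- through its -label()- option.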

Naturally, nothing here is distinctive to histograms.
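
As an illustration of that point, here is the first example again with
only the graph command changed. The choice of -scatter- and of weight as
the second variable is arbitrary.

```stata
sysuse auto, clear
preserve
local which = _N + 1
expand 2
replace foreign = -1 in `which'/L
label define origin -1 "Total", add
* any graph command that accepts by() can be substituted here
scatter mpg weight, by(foreign)
restore
```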

Nick
n.j.cox@durham.ac.uk

*From:* Thoma, Marie E.

I would like to use the "histogram yvar, by(xvar, total)" command to
produce a histogram of the total and stratified variable.  However, in
Stata, it places the "total" graph as the last graph and I would like to
have it as the first graph (before the stratified graphs).

Does anyone know how to change this either using this command or another
way to accomplish this same layout?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



