# RE: st: RE: combining frequency tables

 From "Nick Cox" To Subject RE: st: RE: combining frequency tables Date Wed, 25 Feb 2009 11:16:23 -0000

```I wasn't paying attention to the implications of the fact that you were
doing this under -by:-, but a very little experimentation and analysis
is enough to clarify what is happening.

Your speculation is the wrong way round. It's not a question of what
-matrix- will "accept"; if there were a problem with that there would be
an error message. It's a question of what is left behind earlier that
-matrix- can get hold of.

Forget Rose Ann Medeiros' -mdesc- and my hack -mmdesc- and focus on
something accessible to all.

. sysuse auto
(1978 Automobile Data)

. bysort foreign : su mpg

------------------------------------------------------------------------
---------
-> foreign = Domestic

Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
mpg |        52    19.82692    4.743297         12         34

------------------------------------------------------------------------
-------
-> foreign = Foreign

Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
mpg |        22    24.77273    6.611187         14         41

. ret li

scalars:
r(N) =  22
r(sum_w) =  22
r(mean) =  24.77272727272727
r(Var) =  43.70779220779221
r(sd) =  6.611186898567625
r(min) =  14
r(max) =  41
r(sum) =  545

This is typical (although not universal) of how -by:- works with r-class
results. What happens is that

1. -summarize- does the work for Domestic cars, showing the results and
leaving them in r-class results, such as r(N), r(mean), etc.

2. -summarize- does the work for Foreign cars, showing the results and
leaving them in r-class results, such as r(N), r(mean), etc.

What's left behind -- and what you can access therefore -- is just the
_last_ set of r-class results. As it goes through the subsets, Stata
overwrites each set of r-class results with the next. This is generic of
byable commands (in Stata's jargon) written as -recall- commands not
-onecall- commands.

Look at the code for -mdesc- and -mmdesc- and you will see that they are
-recall- commands.

What's to be done? Here are three solutions; there are probably others
too.

1. Rewrite -mdesc- or -mmdesc- to be -onecall- commands. That's a job
for a Stata programmer.

2. Use -statsby-, say with -mdesc-. (If you don't know about -statsby-,
find out anyway.)

3. Call -mmdesc- in a loop and accumulate results. There's more on
technique at

FAQ     . . . . . . . . . . Making foreach go through all values of a
variable
8/05    Is there a way to tell Stata to try all values of a
particular variable in a foreach statement without
specifying them?
http://www.stata.com/support/faqs/data/foreach.html

This could be something like

egen which = group(treatment timpoint), label
su which, meanonly
local ng = r(max)

forval i = 1/`ng' {
di "`: label (group) `i''"
mmdesc use1 use2
matrix all = nullmat(all) \ r(table)
}

I am not paying attention to your extra -if- conditions which impart
another level of complexity. You would need to call -mmdesc- for each
distinct set of conditions. After each call to -mmdesc- you would need
to copy the resulting matrix before it was overwritten.

Nick
n.j.cox@durham.ac.uk

Thanks, Nick.

For some reason, I am having no joy using the -mat- option. This may be
because I am using -by- option with -mmdes- and -mat- is not willing to
accept (at least in my case) multiple rows that represented different
-by-
categories. I am still getting a single row matrix even for the first
step
before I join two matrices.

I think I will do this in Excel.

On Feb 23 2009, Nick Cox wrote:

>-findit mdesc- reveals that it is
>a program by Rose Anne Medeiros, who once wrote a program called -njc-,
>for which I am not responsible. Here is a matrix version of her
program,
>for which she is not responsible.
>
>*! mmdesc 1.0 NJC 23 February 2009
>* mdesc 1.0 RAM 18 JULY 2008
>* Returns a table with number missing, total, and missing/total
>program mmdesc, rclass byable(recall)
>version 8
>syntax [varlist] [if] [in]
>tempvar touse
>mark `touse' `if' `in'
>local nvars : word count `varlist'
>tempname matrix
>matrix `matrix' = J(`nvars', 3, .)
>local i = 1
>quietly foreach var of local varlist {
>	count if missing(`var') & `touse'
>	matrix `matrix'[`i', 1] = r(N)
>	count if `touse'
>	matrix `matrix'[`i', 2] = r(N)
>	matrix `matrix'[`i', 3] = `matrix'[`i',1] / `matrix'[`i',2]
>	local ++i
>}
>matrix rownames `matrix' = `varlist'
>matrix colnames `matrix' = Missing Total Missing/Total
>matrix list `matrix', noheader
>return matrix table = `matrix'
>end
>
>Once you run this, you can save its results by e.g.
>
>mat A = r(table)
>
>...
>
>mat B = r(table)
>
>...
>
>mat C = A + B
>
>
>Nick
>n.j.cox@durham.ac.uk
>
>
>Another (perhaps) simple question: is it possible to add different
>frequency tables together to form a combined single table in Excel?
>
>I have two group variables (say, var1: type of treatment; and var2:
time
>
>point of observation) and 5 use variables. So an observation would be
>for
>treatment A at time point 1. I want to know the number of missing
values
>
>for each use variable for each treatment group and each time point.
Also
>
>for each use variable there are 'if' conditions, so I can't do the
>frequency count in one go. Here is what I am doing using a user written

>programme -mdesc- or -missing-:
>
>by treatment timpoint: mdesc use1 if (condition1 condition2)
>
>by treatment timpoint: mdesc use2 if (condition3 condition4)
>
>and so on.
>
>
>Now I want to combine all these counts for all timepoints and treatment

>groups to get a single table that I can export to Excel. Is there any
>way I
>could do that?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
```