Your example command has the form

    two or more variables
    BY
    one or more statistics
    BY
    categories of a variable.

Such a request is, logically, for several matrices, not one.
Your assertion that -statsmat- and -tabstatmat-
have "issues" -- as if they were in need of counselling
or psychoanalysis -- is correct. Both of these
commands are, as is totally explicit in their help,
designed to produce individual matrices. Their
issue is this: they are unable to understand a request
for which they were not designed. I am sorry that the
help was not clear enough for you to understand this,
but the statement is there in both files.
The point can be made better by looking at an example
similar to yours.
. tabstat turn trunk length, stat(mean sd) by(foreign) save

Summary statistics: mean, sd
  by categories of: foreign (Car type)

 foreign |      turn     trunk    length
---------+------------------------------
Domestic |  41.44231     14.75  196.1346
         |  3.967582  4.306288  20.04605
---------+------------------------------
 Foreign |  35.40909  11.40909  168.5455
         |  1.501082  3.216906  13.68255
---------+------------------------------
   Total |  39.64865  13.75676  187.9324
         |  4.399354  4.277404  22.26634
----------------------------------------
The table structure shouldn't fool you.
This is a picture of a three-dimensional
array, not a matrix. It could be forced
into a matrix, but that's not the Stata way.
Correspondingly, if you look at what is left behind in
memory after -tabstat-, you have a series of matrices:
. ret li

macros:
              r(name2) : "Foreign"
              r(name1) : "Domestic"

matrices:
              r(Stat2) :  2 x 3
              r(Stat1) :  2 x 3
          r(StatTotal) :  2 x 3
Looking at the matrices in turn,
. mat li r(Stat2)

r(Stat2)[2,3]
           turn      trunk     length
mean  35.409091  11.409091  168.54545
  sd  1.5010819  3.2169061  13.682548

. mat li r(Stat1)

r(Stat1)[2,3]
           turn      trunk     length
mean  41.442308      14.75  196.13462
  sd  3.9675817  4.3062882  20.046054

. mat li r(StatTotal)

r(StatTotal)[2,3]
           turn      trunk     length
mean  39.648649  13.756757  187.93243
  sd  4.3993537  4.2774042   22.26634

you will see that a program would need to
pick up these elements and combine them
in some way.
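
To make "combine them in some way" concrete, here is a minimal sketch that
stacks the three saved matrices into one. It assumes the auto data, exactly
two categories of the -by()- variable, and value labels that are single
words; the matrix names S1, S2, ST and all are my own inventions for
illustration.

```stata
sysuse auto, clear
quietly tabstat turn trunk length, stat(mean sd) by(foreign) save

* copy the names and matrices out of r() before anything overwrites them
local name1 `r(name1)'
local name2 `r(name2)'
matrix S1 = r(Stat1)
matrix S2 = r(Stat2)
matrix ST = r(StatTotal)

* tag each block with its category so that the rows stay identifiable
* (assumes single-word labels, as with Domestic and Foreign here)
matrix roweq S1 = `name1'
matrix roweq S2 = `name2'
matrix roweq ST = Total

* stack the three 2 x 3 blocks into one 6 x 3 matrix
matrix all = S1 \ S2 \ ST
matrix list all
```

A general program would have to loop over however many r(Stat#) matrices
-tabstat- leaves behind, which is part of why I would not go this way.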
I do not know if you are a competent Stata programmer,
but a glance at -latabstat- suggests to me
that converting that into what you want would require one.
I wouldn't start from there. Consider this:
. collapse (mean) turnmean=turn trunkmean=trunk lengthmean=length (sd) turnsd=turn trunksd=trunk lengthsd=length, by(foreign)

. l

     +--------------------------------------------------------------------------+
     |  foreign   turnmean   trunkm~n   length~n    turnsd   trunksd   lengthsd |
     |--------------------------------------------------------------------------|
  1. | Domestic    41.4423      14.75    196.135   3.96758   4.30629    20.0461 |
  2. |  Foreign    35.4091    11.4091    168.545   1.50108   3.21691    13.6825 |
     +--------------------------------------------------------------------------+

. reshape long turn trunk length, i(foreign) string
(note: j = mean sd)

Data                              wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                       2   ->   4
Number of variables                  7   ->   5
j variable (2 values)                    ->   _j
xij variables:
                       turnmean turnsd   ->   turn
                     trunkmean trunksd   ->   trunk
                   lengthmean lengthsd   ->   length
-----------------------------------------------------------------------------

. l

     +-----------------------------------------------+
     |  foreign     _j      turn     trunk    length |
     |-----------------------------------------------|
  1. | Domestic   mean   41.4423     14.75   196.135 |
  2. | Domestic     sd   3.96758   4.30629   20.0461 |
  3. |  Foreign   mean   35.4091   11.4091   168.545 |
  4. |  Foreign     sd   1.50108   3.21691   13.6825 |
     +-----------------------------------------------+

You now have a dataset of similar form. So, let's see
how far we get if we automate this:
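*! NJC 1.0.0 31 March 2006
program mytabstatexport
        version 8.2
        syntax varlist(numeric) [if] [in] using, ///
                BY(varname) Statistics(str) [ * ]

        * mark the observations to use; note that -marksample- also
        * excludes observations missing on any variable in varlist
        marksample touse
        markout `touse' `by', strok
        qui count if `touse'
        if r(N) == 0 error 2000

        preserve

        * build the -collapse- call, one (stat) block per statistic
        foreach s of local statistics {
                local call "`call'(`s') "
                foreach v of local varlist {
                        local call "`call'`v'`s' = `v' "
                }
        }

        * restrict to the marked sample so that -if- and -in- are honoured
        qui collapse `call' if `touse', by(`by')
        qui reshape long `varlist', i(`by') j(statistics) string
        list
        outfile `using', `options'
end
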
. mytabstatexport turn trunk length using test.txt , by(foreign) s(mean sd) comma

     +---------------------------------------------------+
     |  foreign   statis~s      turn     trunk    length |
     |---------------------------------------------------|
  1. | Domestic       mean   41.4423     14.75   196.135 |
  2. | Domestic         sd   3.96758   4.30629   20.0461 |
  3. |  Foreign       mean   35.4091   11.4091   168.545 |
  4. |  Foreign         sd   1.50108   3.21691   13.6825 |
     +---------------------------------------------------+

. type test.txt
"Domestic","mean",41.4423,14.75,196.135
"Domestic","sd",3.96758,4.30629,20.0461
"Foreign","mean",35.4091,11.4091,168.545
"Foreign","sd",1.50108,3.21691,13.6825

As the -comma- option shows, -mytabstatexport- passes any further
options on to -outfile-. -statistics()- must contain statistic names
that -collapse- understands.
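
As a hedged illustration of both points, a call along these lines should
work with the auto data. The filename test2.txt is just a placeholder;
median, min and max are statistics that -collapse- knows, and -replace- is
an -outfile- option that is passed through.

```stata
sysuse auto, clear

* median, min and max are all valid -collapse- statistics;
* comma and replace are handed on to -outfile-
mytabstatexport turn trunk length using test2.txt, ///
        by(foreign) s(median min max) comma replace
```

Something like s(average), which -collapse- does not recognise, would fail
inside the -collapse- call.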
My guess based on this fooling around and some previous
experience is that:
1. -tabstat- is excellent at what it does, but trying to
pick up its results is a bit of a pain.
2. -collapse-, -reshape- and -outfile- offer excellent
building blocks for your own alternative.
3. As yet, -mytabstatexport- does not do statistics
for totals. That could be done, with a little pain,
I guess.
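
On point 3, one way to get totals is to run -collapse- twice, once without
-by()-, and put the two results together. What follows is only a sketch for
a do-file, assuming the auto data; the variable name group and the names
clist and totals are my own choices, and it replaces the data in memory.

```stata
sysuse auto, clear
local clist (mean) turnmean=turn trunkmean=trunk lengthmean=length ///
            (sd) turnsd=turn trunksd=trunk lengthsd=length

* overall statistics, saved to a temporary file
preserve
collapse `clist'
gen str5 group = "Total"
tempfile totals
save `totals'
restore

* statistics by category, identified by their value labels as strings
collapse `clist', by(foreign)
decode foreign, gen(group)
drop foreign
append using `totals'

* same reshape as before, now with a Total block as well
reshape long turn trunk length, i(group) j(statistics) string
list
```

Building this into -mytabstatexport- would mean handling a -by()- variable
that is a string or has no value labels, which is where the pain comes in.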
Nick
n.j.cox@durham.ac.uk
Venable
> I use Stata 8.2 for Windows. I would like to export results from
> tabstat to .txt files. The format in the Stata Results window, with
> the statistics "stacked" over each other, is almost perfect (if I
> could put standard deviations in parentheses, that would be perfect),
> but copying and pasting is time-consuming and could lead to errors,
> plus it makes replication difficult.
>
> The sort of tabstat command I would like to export is
>
> tabstat wheat corn barley, stat(mean sd) by(year) save
>
> I have tried -statsmat- and -tabstatmat-, but neither seems to work
> quite right for what I want to do:
>
> -statsmat-: using statsmat as a substitute for tabstat is problematic,
> because "by(byvar) is allowed only with a single varname in varlist".
> Furthermore, even when you use just one varname, the different
> statistics (mean sd in my example) are displayed in columns rather
> than one beneath the other (as in tabstat).
>
> -tabstatmat-: this seems to have similar issues - I get a
> "conformability error" with multiple varnames in the varlist; with
> just one varname in the varlist, the statistics are returned in
> separate columns rather than one under the other (this occurs whether
> or not I specify columns(variables) in the tabstat command)
>
> I have also tried the -tablemat- command but this allows only
> one statistic.
>
> The closest thing for what I would like to do seems to be -latabstat-,
> which exports beautifully to LaTeX tables - is there something similar
> for text files?
>
> Does anyone have any advice? I apologize if this has been discussed
> before - I did a quick search of the Statalist archives but did not
> find anything addressing these points. Please let me know if I have
> overlooked something.