Statalist The Stata Listserver



st: RE: Exporting tabstat results to .txt (or other format) files


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Exporting tabstat results to .txt (or other format) files
Date   Fri, 31 Mar 2006 00:33:29 +0100

Your example command has the form 

	two or more variables 

BY 

	one or more statistics

BY

	categories of a variable. 

These requests are, logically, for several matrices, 
not one. 

Your assertion that -statsmat- and -tabstatmat- 
have "issues" -- as if they were in need of counselling
or psychoanalysis -- is correct. Both of these 
commands are, as is totally explicit in their help, 
designed to produce individual matrices. Their 
issue is this: they are unable to understand a request
for which they were not designed. I am sorry that the 
help was not clear enough for you to understand this, 
but the statement is there in both files. 

The point can be made better by looking at an example
similar to yours. 

. tabstat turn trunk length, stat(mean sd) by(foreign) save

Summary statistics: mean, sd
  by categories of: foreign (Car type)

 foreign |      turn     trunk    length
---------+------------------------------
Domestic |  41.44231     14.75  196.1346
         |  3.967582  4.306288  20.04605
---------+------------------------------
 Foreign |  35.40909  11.40909  168.5455
         |  1.501082  3.216906  13.68255
---------+------------------------------
   Total |  39.64865  13.75676  187.9324
         |  4.399354  4.277404  22.26634
----------------------------------------

The table structure shouldn't fool you. 
This is a picture of a three-dimensional
array, not a matrix. It could be forced
into a matrix, but that's not the Stata way. 

Correspondingly, if you look at what is left behind in 
memory after -tabstat-, you have a series of matrices: 

. ret li

macros:
             r(name2) : "Foreign"
             r(name1) : "Domestic"

matrices:
             r(Stat2) :  2 x 3
             r(Stat1) :  2 x 3
         r(StatTotal) :  2 x 3

Looking at the matrices in turn, 

. mat li r(Stat2) 

r(Stat2)[2,3]
           turn      trunk     length
mean  35.409091  11.409091  168.54545
  sd  1.5010819  3.2169061  13.682548

. mat li r(Stat1) 

r(Stat1)[2,3]
           turn      trunk     length
mean  41.442308      14.75  196.13462
  sd  3.9675817  4.3062882  20.046054

. mat li r(StatTotal) 

r(StatTotal)[2,3]
           turn      trunk     length
mean  39.648649  13.756757  187.93243
  sd  4.3993537  4.2774042   22.26634

you will see that a program would need to 
pick up these elements and combine them 
in some way. 
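
For what it is worth, here is a rough sketch of what "picking up and combining" could look like, assuming the auto data and category names without embedded spaces. The matrix names S1, S2, ST and stacked are made up for this illustration; the r() names are those listed above.

```stata
* Sketch only: stack the matrices saved by -tabstat- into one matrix.
sysuse auto, clear
tabstat turn trunk length, stat(mean sd) by(foreign) save

* Copy everything out of r() at once, before another command overwrites it
local name1 "`r(name1)'"
local name2 "`r(name2)'"
matrix S1 = r(Stat1)
matrix S2 = r(Stat2)
matrix ST = r(StatTotal)

* "\" joins matrices by rows; all three share the same column names
matrix stacked = S1 \ S2 \ ST

* Label each pair of rows (mean, sd) with its category as a row equation
* (assumes the category names contain no spaces)
matrix roweq stacked = `name1' `name1' `name2' `name2' Total Total
matrix list stacked
```

This gets as far as a single 6 x 3 matrix, but the number of categories is hard-coded and the result still has to be written to a file somehow.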

I do not know if you are a competent Stata programmer, 
but a glance at -latabstat- suggests to me 
that converting that into what you want would require one. 

I wouldn't start from there. Consider this: 

. collapse (mean) turnmean=turn trunkmean=trunk lengthmean=length 
> (sd) turnsd=turn trunksd=trunk lengthsd=length, by(foreign) 

. l

     +--------------------------------------------------------------------------+
     |  foreign   turnmean   trunkm~n   length~n    turnsd   trunksd   lengthsd |
     |--------------------------------------------------------------------------|
  1. | Domestic    41.4423      14.75    196.135   3.96758   4.30629    20.0461 |
  2. |  Foreign    35.4091    11.4091    168.545   1.50108   3.21691    13.6825 |
     +--------------------------------------------------------------------------+

. reshape long turn trunk length, i(foreign) string 
(note: j = mean sd)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        2   ->       4
Number of variables                   7   ->       5
j variable (2 values)                     ->   _j
xij variables:
                        turnmean turnsd   ->   turn
                      trunkmean trunksd   ->   trunk
                    lengthmean lengthsd   ->   length
-----------------------------------------------------------------------------

. l

     +-----------------------------------------------+
     |  foreign     _j      turn     trunk    length |
     |-----------------------------------------------|
  1. | Domestic   mean   41.4423     14.75   196.135 |
  2. | Domestic     sd   3.96758   4.30629   20.0461 |
  3. |  Foreign   mean   35.4091   11.4091   168.545 |
  4. |  Foreign     sd   1.50108   3.21691   13.6825 |
     +-----------------------------------------------+

You now have a dataset laid out much like the -tabstat- table, with 
the statistics stacked within each category. So, let's see
how far we get if we automate this: 

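*! NJC 1.0.0 31 March 2006 
program mytabstatexport
	version 8.2 
	syntax varlist(numeric) [if] [in] using, ///
	BY(varname) Statistics(str) [ * ]

	* mark the sample: respect if/in and exclude observations
	* with missing values on any variable or on by()
	marksample touse 
	markout `touse' `by', strok 
	qui count if `touse' 
	if r(N) == 0 error 2000 

	* the original data are restored when the program ends
	preserve 
	* build the -collapse- call: (stat) var1stat = var1 var2stat = var2 ...
	foreach s of local statistics {
		local call "`call'(`s') " 
		foreach v of local varlist { 
			local call "`call'`v'`s' = `v' " 
		}
	} 

	qui collapse `call' if `touse', by(`by') 
	* one observation per category and statistic
	qui reshape long `varlist', i(`by') j(statistics) string 

	list 
	* any other options are passed through to -outfile-
	outfile `using', `options' 
end 	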

. mytabstatexport turn trunk length using test.txt , by(foreign) s(mean sd) comma 

     +---------------------------------------------------+
     |  foreign   statis~s      turn     trunk    length |
     |---------------------------------------------------|
  1. | Domestic       mean   41.4423     14.75   196.135 |
  2. | Domestic         sd   3.96758   4.30629   20.0461 |
  3. |  Foreign       mean   35.4091   11.4091   168.545 |
  4. |  Foreign         sd   1.50108   3.21691   13.6825 |
     +---------------------------------------------------+

. type test.txt
"Domestic","mean",41.4423,14.75,196.135
"Domestic","sd",3.96758,4.30629,20.0461
"Foreign","mean",35.4091,11.4091,168.545
"Foreign","sd",1.50108,3.21691,13.6825

As the -comma- option exemplifies, -mytabstatexport- passes any 
other options through to -outfile-. -statistics()- has to contain 
statistics that -collapse- understands, such as mean, sd or median. 
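
Once the results are in this long layout, the wish for standard deviations in parentheses looks like a matter of string handling before -outfile-. The following is only a sketch, assuming the auto data; the variable name statistic, the format %9.2f and the file name test_paren.txt are my choices for illustration, not part of -mytabstatexport-.

```stata
* Sketch only: show sd as "(x.xx)" by converting the results to strings.
sysuse auto, clear
collapse (mean) turnmean=turn trunkmean=trunk lengthmean=length ///
         (sd) turnsd=turn trunksd=trunk lengthsd=length, by(foreign)
reshape long turn trunk length, i(foreign) j(statistic) string

foreach v of varlist turn trunk length {
	* fixed two decimal places; trim() guards against leading blanks
	gen str12 `v'_s = trim(string(`v', "%9.2f"))
	* wrap the sd rows in parentheses
	replace `v'_s = "(" + `v'_s + ")" if statistic == "sd"
	* swap the string version in under the original name
	drop `v'
	rename `v'_s `v'
}

list, sepby(foreign)
outfile using test_paren.txt, comma replace
```

The price is that the statistics are now strings, so this step belongs at the very end, just before export.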

My guess based on this fooling around and some previous 
experience is that: 

1. -tabstat- is excellent at what it does, but trying to 
pick up its results is a bit of a pain. 

2. -collapse-, -reshape- and -outfile- offer excellent 
building blocks for your own alternative. 

3. As yet, -mytabstatexport- does not do statistics 
for totals. That could be done, with a little pain, 
I guess. 
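
On point 3, one way the totals might be added is to collapse twice, once by category and once over everything, and then -append-. This is a sketch outside the program, assuming the auto data; the variable names group and statistic and the file name totals.txt are invented for the illustration.

```stata
* Sketch only: per-category statistics plus a "Total" block.
local stats "(mean) turnmean=turn trunkmean=trunk lengthmean=length"
local stats "`stats' (sd) turnsd=turn trunksd=trunk lengthsd=length"

* 1. statistics over all observations, labelled "Total"
sysuse auto, clear
gen str5 group = "Total"
collapse `stats', by(group)
tempfile total
save `total'

* 2. statistics by category, using the value labels as a string variable
sysuse auto, clear
decode foreign, gen(group)
collapse `stats', by(group)

* 3. put the two together and stack the statistics
append using `total'
reshape long turn trunk length, i(group) j(statistic) string

list, sepby(group)
outfile using totals.txt, comma replace
```

Here "Total" happens to sort after "Domestic" and "Foreign"; with other category names you would need to control the sort order yourself.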

Nick 
n.j.cox@durham.ac.uk 

Venable
 
> I use Stata 8.2 for Windows. I would like to export results from
> tabstat to .txt files. The format in the Stata Results window, with
> the statistics "stacked" over each other, is almost perfect (if I
> could put standard deviations in parentheses, that would be perfect),
> but copying and pasting is time-consuming and could lead to errors,
> plus it makes replication difficult.
> 
> The sort of tabstat command I would like to export is
> 
> tabstat wheat corn barley, stat(mean sd) by(year) save
> 
> I have tried -statsmat- and -tabstatmat-, but neither seems to work
> quite right for what I want to do:
> 
> -statsmat-: using statsmat as a substitute for tabstat is problematic,
> because "by(byvar) is allowed only with a single varname in varlist".
> Furthermore, even when you use just one varname, the different
> statistics (mean sd in my example) are displayed in columns rather
> than one beneath the other (as in tabstat).
> 
> -tabstatmat-: this seems to have similar issues - I get a
> "conformability error" with multiple varnames in the varlist; with
> just one varname in the varlist, the statistics are returned in
> separate columns rather than one under the other (this occurs whether
> or not I specify columns(variables) in the tabstat command)
> 
> I have also tried the -tablemat- command but this allows only 
> one statistic.
> 
> The closest thing for what I would like to do seems to be -latabstat-,
> which exports beautifully to LaTeX tables - is there something similar
> for text files?
> 
> Does anyone have any advice? I apologize if this has been discussed
> before - I did a quick search of the Statalist archives but did not
> find anything addressing these points. Please let me know if I have
> overlooked something.



