Re: st: combining tables

From	Rebecca Pope <[email protected]>
To	[email protected]
Subject	Re: st: combining tables
Date	Mon, 14 Jan 2013 11:03:23 -0600
All,
This is an addendum to my post from November 6, 2012. When I
originally asked my question, John Luke Gallup's -outreg- was
suggested as a possible solution. I was not able to get it to work in
isolation. However, John recently published an article in The Stata
Journal (1) about -frmttable-, which he terms the "engine" of
-outreg-. Combining -frmttable- and -outreg-, I was able to create
exactly the sort of table I wanted without reducing my data to what
Roger Newson terms a "resultsset" (2) or taxing my limited RTF
capabilities. For those Stata users who, like me, work in an MS
Word-dominated world, I hope that this is helpful.

In my field of medical and health services research, articles often
include a table describing the study population. As often as not, the
statistics are a mix of percentages and means with standard
deviations, "stacked" in columns that represent different treatment
groups, diagnosis groups, and so on. A good, freely available example
of such a table is linked in the post below.

Building such a table with -frmttable- and -outreg- takes three
ingredients. The annotate() and asymbol() options add percent signs
and closing parentheses in the appropriate cells. The doubles() option,
together with the "obscure" dbldiv() option, places each standard
deviation next to its mean rather than beneath it. Finally, replay(),
merge(), and append() combine the tables for the different groups. The
code, applied rather trivially to the auto data, appears below. I'm not
claiming this example is the "best" method, especially with respect to
the hardcoded row names and number of rows, but it should illustrate
the point. I highly recommend John's article and the help for
-frmttable- to anyone seeking to create a similar table in Word.

*** Code ***
sysuse auto.dta, clear
mata: mata clear
* Recode missing repair records to 9 so they form their own labeled category
label def reprec 1 "Poor" 2 "Fair" 3 "Good" 4 "Very Good" ///
 5 "Excellent" 9 "Missing"
replace rep78=9 if missing(rep78)
label val rep78 reprec
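qui {
 * One 0/1 indicator variable per rep78 category (rec1, rec2, ...)
 tabulate rep78, generate(rec)
  * One row per category plus a heading row for rep78 itself
  mat sumstat = J(`r(r)'+1,1,.)
  * Matrix for annotate(): 0 on the heading row, 1 on each percentage row
  mat pcts = 0 \ J(`r(r)',1,1)
  unab reprec: rec?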
 foreach f in 0 1 {
  local i=2
  foreach rec of local reprec {
   local lbl: var label `rec'
   local lbl = subinstr("`lbl'","rep78==","",.)
   label var `rec' "   `lbl'"
   sum `rec' if foreign==`f', meanonly
   mat sumstat[`i',1] = r(mean)*100
   local i = `i'+1
  }
  matrix rownames sumstat = rep78 `reprec'
  frmttable, statmat(sumstat) replace ///
   varlabels sdec(1) annotate(pcts) asymbol("%") merge(col`f')
 }
}
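* dmat feeds doubles(), which with dbldiv() puts each s.d. beside its mean
mat dmat=(0,1,0,1)
* 6 rows: a heading row and a statistics row for each of the 3 variables
mat summstat = J(6,4,.)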
foreach f in 0 1 {
 local i = 2
 foreach v in length weight headroom {
  qui summarize `v' if foreign==`f'
  tempvar `v'
  clonevar ``v'' = `v'
  label var ``v'' "   Average `v' (s.d.)"
  mat summstat[`i',1] = r(mean)
  mat summstat[`i',2] = r(sd)
  local i = `i'+2
 }
 matrix rownames summstat = length `length' weight `weight' headroom `headroom'
 matrix pars = (0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1 \ 0 \ 1)
 * dbldiv() supplies the " (" before each s.d.; asymbol() the closing ")"
 frmttable, statmat(summstat) varlabels doubles(dmat) sdec(1) ///
  dbldiv(" (") annotate(pars) asymbol(")") merge(colc`f')
}
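* Within each group, append the mean (s.d.) block below the percentage block
outreg, replay(col0) append(colc0) store(sum0)
outreg, replay(col1) append(colc1) store(sum1)
* Then merge the two group tables side by side and add titles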
outreg, replay(sum0) merge(sum1) ///
 title("Table 1. Sample Descriptive Statistics") ///
 ctitles("" , "Domestic", "", "Foreign" \ "Variable" , "N=52", "", "N=22")

***end example***
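
On the hardcoding caveat: the following is a minimal sketch, not part
of the example above, of how the group sizes and the number of rows
could be computed rather than typed in. The macro names n0, n1, vars,
k, and nrows are illustrative assumptions, and I have not checked how
this interacts with every -frmttable- option.

```stata
sysuse auto, clear
* Group sizes by foreign status: n0 = domestic, n1 = foreign
foreach f in 0 1 {
    quietly count if foreign == `f'
    local n`f' = r(N)
}
* Two table rows (heading + statistics) per continuous variable
local vars length weight headroom
local k : word count `vars'
local nrows = 2*`k'
matrix summstat = J(`nrows', 4, .)
display "N=`n0'  N=`n1'  rows=`nrows'"
```

The ctitles() row could then use "N=`n0'" and "N=`n1'" in place of the
typed-in 52 and 22.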

Citations:
(1) Gallup, JL. (2012) "A programmer's command to build formatted
statistical tables". The Stata Journal. 12(4):655-673. Available from:
http://www.stata-journal.com/article.html?article=sg97_5

(2) Newson, RB. (2012) "From resultssets to resultstables in Stata".
The Stata Journal. 12(2):191-213. Available from:
http://www.stata-journal.com/article.html?article=st0254

Rebecca

On Tue, Nov 6, 2012 at 11:20 AM, Rebecca Pope <[email protected]> wrote:
> Thanks again to Nick, Daniel K., and Roger for Stata suggestions. Here
> is a summary of my experience with your proposed solutions. It is
> pretty dense, but I've tried to be comprehensive in case anyone else
> runs into a similar situation. On balance, every approach has
> strengths & weaknesses that different users might weigh differently
> than I do.
>
> Nick, you were right about -tabout- coming closest to what I want to
> accomplish in terms of a single command. A particular strength in my
> context is the ability to have it treat the list of supplied
> variables as -tab1- would, rather than producing cross-tabulations or
> grouping over all possible combinations of the variables in the list.
> This was the problem I ran into with -collapse-. The survey extensions
> are also very helpful and, while not applicable in this particular
> context, often are in my work. If I'm working with strictly
> categorical variables, this would be my command of choice. The
> limitation that I have encountered is that I cannot combine continuous
> and categorical variables into one table with mean (sd) for continuous
> & N (%) for categorical. For an example, see Table 1 of Pyne et al.'s
> "How Bad is Depression?", available at
> <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2739035/pdf/hesr0044-1406.pdf>,
> which is a good representation of the sort of table I routinely make
> (I can't claim credit for this one, but it is from my group & freely
> accessible).
>
> Next, to Daniel Klein's suggestion about -outreg-. Yes, I know about
> John Gallup's -outreg-, but I have almost exclusively used Ben Jann's
> -estout- for regression output (Jann B. (2005) Making regression
> tables from stored estimates. Stata Journal 5(3): 288–308). I'm not
> going to make any claims about one's superiority over the other, just
> the relative benefits of familiarity. However, the suggestion about
> -outreg-'s extended capabilities prompted me to look a little closer
> to home and think more creatively about using -estout- since I'm more
> familiar with it. This worked moderately well. The ability to set
> numeric formats cell-by-cell is especially nice. Unfortunately, as far
> as I can tell, you also lose a lot of options when your matrix is
> user-created rather than built from stored estimation results. For
> example, it does not appear that you can add parentheses or place one
> element under another. All I can say for certain is that I couldn't
> when I tried. After this immediate project, I'll look closer at
> -outreg- and potentially write a program that saves the matrix in
> r(), which if I'm reading the -estout- documentation correctly should
> restore some of the functionality.
>
> Then there is the suite of commands that appear in Roger's article...
> If I'd known what I was getting into, I would have waited on this past
> my present project too, but once started, I was too stubborn to stop.
> The level of control is pretty spectacular and it seems, thus far,
> that the main limitations of Roger's approach are the user's & RTF
> capabilities (for those who must work in MS Word, etc). Depending on
> the user, this could be quite significant. Two days of trial and
> error and another full day spent on many additional readings and I
> _finally_ managed to get the RTF output to work correctly. In
> fairness, Roger's article does disclose this: "RTF tables are less
> simple to produce..." I have a much greater appreciation now for the
> commands that do this automatically. In the end though, I wound up
> with a table that only requires right indents for the data columns,
> the addition of top & bottom borders, and an empty row (personal
> preference) above each "gap row" (see -gaprow- in Roger's article).
> This is substantially less formatting than I did before, so I'm quite
> happy. Still, I'd say the initial time investment to understand all
> the intermediate steps was quite high & creating the "resultssets" is
> going to require project-specific code rather than invoking a single
> command. Given the time that it takes to paste in values & format
> tables (especially large ones) in Word, I nevertheless think it is
> worth it.
>
> Once again, thank you to all. I've learned a lot from this exercise.
>
> Regards,
> Rebecca
>
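
As a footnote to the -estout- paragraph in the quoted post: a minimal
sketch, assuming the estout package is installed from SSC, of passing a
user-built matrix to -estout- through its matrix() syntax. The matrix
name S and its row and column names are illustrative assumptions. The
sketch only tabulates the numbers and does not address the parentheses
or stacking limitations described above.

```stata
sysuse auto, clear
* Collect the mean and s.d. of three variables into a 3 x 2 matrix
matrix S = J(3, 2, .)
local i = 1
foreach v in length weight headroom {
    quietly summarize `v'
    matrix S[`i', 1] = r(mean)
    matrix S[`i', 2] = r(sd)
    local ++i
}
matrix rownames S = length weight headroom
matrix colnames S = mean sd
* Tabulate the matrix directly with one decimal place
estout matrix(S, fmt(1))
```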
