From: Eric Booth <ebooth@ppri.tamu.edu>
To: "<statalist@hsphsun2.harvard.edu>" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Is -collapse- the Stata's fastest routine to summarize data sets?
Date: Fri, 9 Jul 2010 14:26:15 +0000
<>

If you want to collapse by several categorical vars with -tabout-, it's not as straightforward as with -collapse-. You can create a single variable that is an indicator of all possible combinations of the n categorical variables and then -tabout- by that combined indicator. For example,

******************!
clear
sysuse auto
cap which tabout
if _rc ssc install tabout

**create n categorical vars**
recode rep78 (.=0)
lab def rep78 1 "one" 2 "two" 3 "three" 4 "four" 5 "five" 0 "zero/miss", modify
lab val rep78 rep78
egen price2 = cut(price), group(4) label
drop price

// 1. collapse
**list every variable except make and the categorical vars**
ds make rep78 for price2, not
local vars `r(varlist)'
**
preserve
collapse (sum) `vars' , by(rep78 price2 foreign)
outsheet using collapsed.csv, comma replace
restore

// 2. tabout
**put "sum" between the variable names for the c() option**
local vars: subinstr local vars " " " sum ", all
di "`vars'"
**
tabout rep78 price2 foreign using taboutex.csv, replace sum c(sum `vars') style(csv) h2(THIS ISN'T WHAT YOU WANT |)

preserve
**decode your categorical vars**
foreach v in rep78 price2 foreign {
    decode `v', g(`v'a)
    drop `v'
    rename `v'a `v'
}
**combine your categorical vars into one var**
g categories = price2 + rep78 + " - " + foreign
ta categories
**
tabout categories using taboutex.csv, append sum c(sum `vars') h2(THIS IS WHAT YOU WANT|) lines(double) style(csv)
restore
******************!

~ Eric

__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
ebooth@ppri.tamu.edu
Office: +979.845.6754

On Jul 8, 2010, at 6:47 PM, Tiago V. Pereira wrote:

> Many, many thanks Eric!
>
> Yes, -tabout- really seems to be much faster than -collapse-. However, I
> could not figure out how to make it work when one has n categorical
> variables, and wants to summarize continuous variables taking all possible
> combinations of the n categorical variables.
>
> -collapse- does that using the by() option.
>
> Thanks again!
>
> Tiago