
# st: RE: Outputs as inputs - how to efficiently process a series of routines?

From: Nick Cox
To: "'statalist@hsphsun2.harvard.edu'"
Subject: st: RE: Outputs as inputs - how to efficiently process a series of routines?
Date: Tue, 31 Aug 2010 10:05:18 +0100

-statsby- is your friend. See the manual entry in [D] and also

SJ-10-1  gr0045 . . . . . . . Speaking Stata: The statsby strategy
         N. J. Cox
         Q1/10   SJ 10(1):143--151   (no commands)
         demonstrates the use of statsby to prepare a reduced
         dataset for subsequent graphing

Nick
n.j.cox@durham.ac.uk
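
As an illustration of the -statsby- suggestion, here is a minimal sketch, not a tested solution for the poster's data. It assumes the four stratifying variables are named sample, year, setting and status, that the summary score is a variable named score, and that the scale items are item1-item10; all of these names are hypothetical. It relies on -summarize, detail- and -alpha- leaving their results in r(), which -statsby- then collects once per stratum.

```stata
* Sketch only: variable names (sample, year, setting, status, score,
* item1-item10) and the file name are hypothetical stand-ins.

* 1. Cronbach's alpha per stratum, saved to a file with one
*    observation per stratum; the data in memory are left unchanged
statsby alpha=r(alpha), by(sample year setting status) ///
    saving(alpha_by_stratum, replace): alpha item1-item10

* 2. N, mean, SD and selected percentiles per stratum; -clear-
*    replaces the data in memory with the reduced dataset
statsby n=r(N) mean=r(mean) sd=r(sd) p25=r(p25) p50=r(p50) p75=r(p75), ///
    by(sample year setting status) clear: summarize score, detail

* 3. Combine both sets of results, one observation per stratum
merge 1:1 sample year setting status using alpha_by_stratum
```

Each observation in the resulting dataset is one stratum, so later calculations that need the mean, SD or alpha can be written as ordinary -generate- statements on that dataset. The -merge 1:1- syntax assumes Stata 11 or later, and the `///` line continuations assume the code is run from a do-file.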

Philip Burgess wrote:

This is a data management question rather than a statistical issue.

I have a dataset which is stratified by 4 variables:

1. Sample - kids, adults, or older persons;
2. Year - 2006, 2007, 2008 or 2009;
3. Treatment setting - inpatient, residential or ambulatory;
4. Status - baseline, follow-up or change.

Thus, the overall structure has 3 x 4 x 3 x 3 = 108 unique strata.

The outcome variable is usually a summary score and I need to estimate
various statistics (say mean, SD, percentiles) for each of the strata;
and I also need to estimate the internal consistency of the outcome
measure with Cronbach's Alpha.

I need to use the estimated statistic(s) as 'input' in a variety of
other calculations (e.g., calculate overall Effect Size using the
mean, the SD; other calculations require Alpha).

I know these statistics are available immediately after execution: I
can list them with -return list-. After that, I can generate a new
variable with -gen double alpha = r(alpha)- and then run
-collapse (first) alpha- to get the required statistic(s).
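
For a single stratum, the manual steps just described might look roughly like the sketch below. The variable names, the numeric codes used in the -keep if- condition, and the file name are all hypothetical; only -alpha-, -return list-, -gen- and -collapse- come from the description above.

```stata
* Manual approach for ONE stratum (hypothetical names and codes)
preserve
keep if sample == 1 & year == 2006 & setting == 1 & status == 1

* -alpha- leaves Cronbach's alpha behind in r(alpha)
alpha item1-item10
return list

* Copy the scalar into a variable, then reduce to one observation
gen double alpha = r(alpha)
collapse (first) alpha

save alpha_stratum1, replace
restore
```

Repeating this block for every combination of the four variables is what produces the 108 hand-written variants.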

The problem is that I have 108 strata. I could 'manually' code each
of these variants (saving each as a temp file, then using -append-
to combine all 108 into a single file), but this is both inefficient
and carries a high risk of error (i.e., me!).

Is there a way around this?

I should add that I have mainly used SPSS for these kinds of data
management issues. Theoretically, using SPSS commands that 'split' the
data file by the required partitions and then using its Output
Management System will achieve the required output. This used to work
with earlier versions but not the current release - hence my efforts
with Stata.
