
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: summary statistics

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: summary statistics
Date	Mon, 1 Oct 2012 17:33:15 +0100

Actually, I pointed to _two_ user-written commands that do this.

In general, I agree with Clyde. The trade-off between doing it
yourself from first principles and finding a suitable user-written
command (or even an official one) is delicate, and the choice always
rests with the user.
However, the success of Clyde's approach depends partly on his being
an experienced user who is fluent in, and feels comfortable with, much
of Stata.

With a structure this simple, a two-line solution is also competitive.

stack HSS*, into(HSS) clear
tab _stack HSS

-- although that loses some detail on variable names and labels, and
so does not qualify as a good solution by itself.
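
One possible way to recover the variable names, given as a sketch
rather than a tested solution: attach value labels to _stack after
stacking. It assumes the variables are named exactly HSS1-HSS18; the
label name stacklbl is made up for illustration. Variable labels are
still lost.

```stata
* Sketch only: assumes the variables are named HSS1-HSS18.
preserve
* Build the varlist explicitly so the stacking order is known.
local vars
forvalues i = 1/18 {
    local vars `vars' HSS`i'
}
stack `vars', into(HSS) clear
* _stack runs from 1 to 18 in the order the variables were stacked.
* stacklbl is a hypothetical label name.
capture label drop stacklbl
forvalues i = 1/18 {
    label define stacklbl `i' "HSS`i'", add
}
label values _stack stacklbl
tabulate _stack HSS
restore
```

The -preserve- and -restore- pair brings back the original data once
the table has been shown.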

However, the full trade-off needs to take account of various awkward facts:

1. You might want to do this repeatedly.

2. You might (almost certainly will) want to go back to your original
data structure.

3. You might want to carry weights through the -reshape- too.
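
To make these points concrete, here is a hedged variation on Clyde's
code (quoted below). It assumes an identifier id, as he does, and a
weight variable, here called wt, that is constant within id; wt is a
hypothetical name and not part of the original question. Dropping
missing responses before -collapse- is also an added assumption.

```stata
* Sketch only: id and wt are assumed to exist; wt is a hypothetical name.
* Point 2: preserve keeps the original data structure recoverable.
preserve
* Point 3: wt is not reshaped, so it is carried along with each id.
reshape long HSS, i(id) j(varnum)
* Assumption: missing responses are excluded from the table.
drop if missing(HSS)
* Summing the weights gives weighted counts for each response.
collapse (sum) Col = wt, by(varnum HSS)
reshape wide Col, i(varnum) j(HSS)
gen variable = "HSS" + string(varnum)
list variable Col*, noobs clean
restore
```

For point 1, the same lines could be kept in a do-file and rerun as
needed.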

As said, I am in agreement, just spelling out some issues.

Nick

On Mon, Oct 1, 2012 at 5:14 PM, Clyde B Schechter
<[email protected]> wrote:
> Don Spady was looking for a command that would take variables HSS1-HSS18, each with a discrete 1 to 5 response set and create a table like:
>
>          Col1   Col2   Col3   Col4   Col5
> HSS1     n1     n2     n3     n4     n5
> HSS2     n1     n2     n3     n4     n5
> HSS3
>
> And Nick Cox pointed him to a user-written command that does this.
>
> I would just add that this can also be easily done using a few built-in Stata commands:
>
> (I assume there is another variable, called id, which identifies the observations.  If not, it can be generated first)
>
> reshape long HSS, i(id) j(varnum)
> collapse (count) Col = id, by(varnum HSS)
> reshape wide Col, i(varnum) j(HSS)
> gen variable= "HSS"+string(varnum)
> list variable Col*, noobs clean
>
>
> I suppose it is a matter of taste which way to do these things.  In general, if it is something I do repeatedly, I find the convenience of a single command (which I might write an ado file for myself) worthwhile.  But if it's a one-off, it's generally faster to write a few lines of code and also not later be bothered with trying to remember what some unfamiliar command name means.
