



Re: st: summary statistics


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: summary statistics
Date   Mon, 1 Oct 2012 17:33:15 +0100

Actually, I pointed to _two_ user-written commands that do this.

In general, I agree with Clyde. The trade-off between doing it
yourself from first principles and finding a suitable user-written
command (or even an official one) is delicate, and the choice always
rests with the user.
However, the success of Clyde's approach depends partly on his being
an experienced user who is fluent in, and feels comfortable with, much
of Stata.

With a structure this simple, a two-line solution is also competitive.

stack HSS*, into(HSS) clear
tab _stack HSS

-- although that loses some detail on variable names and labels, and
so does not qualify as a good solution by itself.

However, the full trade-off needs to take account of various awkward facts:

1. You might want to do this repeatedly.

2. You might (almost certainly will) want to go back to your original
data structure.

3. You might want to carry weights through the -reshape- too.
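
Points 1 and 2 can be handled together by wrapping the two lines above in a small program that uses -preserve- and -restore-. What follows is a minimal sketch, not a tested solution: the program name hsstab is hypothetical, and it assumes the variables are named HSS1-HSS18 as in the original question. It does nothing about weights (point 3).

```stata
* Sketch only: hsstab is a hypothetical name, not an existing command.
* Assumes variables HSS1-HSS18 are in memory, as in the original question.
capture program drop hsstab
program define hsstab
    * keep a copy of the original data
    preserve
    * stack the HSS variables into one variable; _stack records the source
    stack HSS*, into(HSS) clear
    * rows: stacked variable number; columns: response value
    tabulate _stack HSS
    * bring back the original data structure
    restore
end
```

Once the definition has been run, typing hsstab should show the table and leave the data in memory as they were.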

As said, I am in agreement, just spelling out some issues.

Nick

On Mon, Oct 1, 2012 at 5:14 PM, Clyde B Schechter
<clyde.schechter@einstein.yu.edu> wrote:
> Don Spady was looking for a command that would take variables HSS1-HSS18, each with a discrete 1 to 5 response set and create a table like:
>
>                 Col 1    Col2     Col 3    Col 4   Col5
> HSS1        n1            n2            n3             n4           n5
> HSS2        n1            n2            n3             n4           n5
> HSS3
>
> And Nick Cox pointed him to a user-written command that does this.
>
> I would just add that this can also be easily done using a few built-in Stata commands:
>
> (I assume there is another variable, called id, which identifies the observations.  If not, it can be generated first)
>
> reshape long HSS, i(id) j(varnum)
> collapse (count) Col = id, by(varnum HSS)
> reshape wide Col, i(varnum) j(HSS)
> gen variable= "HSS"+string(varnum)
> list variable Col*, noobs clean
>
>
> I suppose it is a matter of taste which way to do these things.  In general, if it is something I do repeatedly, I find the convenience of a single command (which I might write an ado file for myself) worthwhile.  But if it's a one-off, it's generally faster to write a few lines of code and also not later be bothered with trying to remember what some unfamiliar command name means.


