
# st: -moments- available from SSC

From: "Nick Cox"

Subject: st: -moments- available from SSC

Date: Tue, 28 Sep 2004 10:41:29 +0100

```
Thanks to Kit Baum, a program -moments- is now
available in a package of the same name from
SSC. Stata 8.2 is required.

-moments- has been mentioned a couple of times
in recent postings to Statalist. The point was
made that if you are doing something like -sktest-,
you should also look at the skewness and kurtosis
(a graph too, naturally).

-moments- calculates number of observations, mean,
standard deviation, skewness and kurtosis for a list
of variables.

Your reaction to that is likely to be one or both of
two things:

(1) Surely -summarize- does that already.

(2) Surely -tabstat- is already available for customised
tables of summary statistics.

If you thought that, you are correct. The merits
of -moments- are purely matters of convenience or presentation.

-summarize- produces these measures, but together with
a lot of other stuff:

. su price, detail

                            Price
-------------------------------------------------------------
      Percentiles      Smallest
 1%         3291           3291
 5%         3748           3299
10%         3895           3667       Obs                  74
25%         4195           3748       Sum of Wgt.          74

50%       5006.5                      Mean           6165.257
                        Largest       Std. Dev.      2949.496
75%         6342          13466
90%        11385          13594       Variance        8699526
95%        13466          14500       Skewness       1.653434
99%        15906          15906       Kurtosis       4.819188

-tabstat- is the obvious answer to that problem.

. tabstat price-foreign, c(s) s(n mean sd skew kurt)

    variable |         N      mean        sd  skewness  kurtosis
-------------+--------------------------------------------------
       price |        74  6165.257  2949.496  1.653434  4.819188
         mpg |        74   21.2973  5.785503  .9487176  3.975005
       rep78 |        69  3.405797  .9899323 -.0570331  2.678086
    headroom |        74  2.993243  .8459948  .1408651  2.208453
       trunk |        74  13.75676  4.277404  .0292034  2.192052
      weight |        74  3019.459  777.1936  .1481164  2.118403
      length |        74  187.9324  22.26634 -.0409746   2.04156
        turn |        74  39.64865  4.399354  .1238259  2.229458
displacement |        74  197.2973  91.83722  .5916565  2.375577
  gear_ratio |        74  3.014865  .4562871  .2191658  2.101812
     foreign |        74  .2972973  .4601885  .8869686  1.786713
----------------------------------------------------------------

When I see a table like that, I want fewer decimal places. I
tend to go for 3, and on some criteria that is way too many:

. tabstat price-foreign, c(s) s(n mean sd skew kurt)  format(%4.3f)

    variable |         N      mean        sd  skewness  kurtosis
-------------+--------------------------------------------------
       price |    74.000  6165.257  2949.496     1.653     4.819
         mpg |    74.000    21.297     5.786     0.949     3.975
       rep78 |    69.000     3.406     0.990    -0.057     2.678
    headroom |    74.000     2.993     0.846     0.141     2.208
       trunk |    74.000    13.757     4.277     0.029     2.192
      weight |    74.000  3019.459   777.194     0.148     2.118
      length |    74.000   187.932    22.266    -0.041     2.042
        turn |    74.000    39.649     4.399     0.124     2.229
displacement |    74.000   197.297    91.837     0.592     2.376
  gear_ratio |    74.000     3.015     0.456     0.219     2.102
     foreign |    74.000     0.297     0.460     0.887     1.787
----------------------------------------------------------------

That is clearly better, but some small details are irritating.

1. If I use a non-default -format()-, I get it everywhere. (My
punishment is that I got what I asked for.) In the case of
number of observations, this looks a little silly. As -tabstat-
accepts at most frequency or analytical weights, that column N
is always going to contain integers. I've previously suggested
that -tabstat- be modified to ignore -format()- in the case of N,
but to no effect.

2. That's the only control over small details of
presentation that you get. (You can transpose the table, which
is on occasion very useful.)

The default output of -moments- is like this:

. moments

-----------------------------------------------------------------------
                n = 69 |       mean          SD    skewness    kurtosis
-----------------------+-----------------------------------------------
                 Price |   6146.043    2912.440       1.688       5.032
         Mileage (mpg) |     21.290       5.866       0.995       3.997
    Repair Record 1978 |      3.406       0.990      -0.057       2.678
        Headroom (in.) |      3.000       0.853       0.197       2.144
 Trunk space (cu. ft.) |     13.928       4.343      -0.044       2.159
         Weight (lbs.) |   3032.029     792.851       0.118       2.073
          Length (in.) |    188.290      22.747      -0.076       2.000
     Turn Circle (ft.) |     39.797       4.441       0.071       2.228
Displacement (cu. in.) |    198.000      93.148       0.581       2.354
            Gear Ratio |      2.999       0.463       0.279       2.109
              Car type |      0.304       0.464       0.850       1.723
-----------------------------------------------------------------------

The default is now %9.3f. Well, I like that.

Also, by default casewise deletion is used: statistics are computed for
the sample that is not missing for any of the variables.  The constant
n = 69 can thus be tucked away in a corner. That's the other way
round from -summarize- or -tabstat-. Naturally, you can get the opposite
behaviour if you wish:

. moments, allobs

-----------------------------------------------------------------------------------
              Variable |          n        mean          SD    skewness    kurtosis
-----------------------+-----------------------------------------------------------
                 Price |         74    6165.257    2949.496       1.653       4.819
         Mileage (mpg) |         74      21.297       5.786       0.949       3.975
    Repair Record 1978 |         69       3.406       0.990      -0.057       2.678
        Headroom (in.) |         74       2.993       0.846       0.141       2.208
 Trunk space (cu. ft.) |         74      13.757       4.277       0.029       2.192
         Weight (lbs.) |         74    3019.459     777.194       0.148       2.118
          Length (in.) |         74     187.932      22.266      -0.041       2.042
     Turn Circle (ft.) |         74      39.649       4.399       0.124       2.229
Displacement (cu. in.) |         74     197.297      91.837       0.592       2.376
            Gear Ratio |         74       3.015       0.456       0.219       2.102
              Car type |         74       0.297       0.460       0.887       1.787
-----------------------------------------------------------------------------------

The number of observations remains shown as an integer. You can specify up
to four numeric formats; they apply in order to the mean, standard deviation,
skewness and kurtosis, and any statistic left without one keeps the default.

. moments, format(%2.1f %2.1f)

-----------------------------------------------------------------------
                n = 69 |       mean          SD    skewness    kurtosis
-----------------------+-----------------------------------------------
                 Price |     6146.0      2912.4       1.688       5.032
         Mileage (mpg) |       21.3         5.9       0.995       3.997
    Repair Record 1978 |        3.4         1.0      -0.057       2.678
        Headroom (in.) |        3.0         0.9       0.197       2.144
 Trunk space (cu. ft.) |       13.9         4.3      -0.044       2.159
         Weight (lbs.) |     3032.0       792.9       0.118       2.073
          Length (in.) |      188.3        22.7      -0.076       2.000
     Turn Circle (ft.) |       39.8         4.4       0.071       2.228
Displacement (cu. in.) |      198.0        93.1       0.581       2.354
            Gear Ratio |        3.0         0.5       0.279       2.109
              Car type |        0.3         0.5       0.850       1.723
-----------------------------------------------------------------------

You'll notice the variable labels, shown by default. You can override
that too:

. moments, format(%2.1f %2.1f)  variablenames

-------------------------------------------------------------
      n = 69 |       mean          SD    skewness    kurtosis
-------------+-----------------------------------------------
       price |     6146.0      2912.4       1.688       5.032
         mpg |       21.3         5.9       0.995       3.997
       rep78 |        3.4         1.0      -0.057       2.678
    headroom |        3.0         0.9       0.197       2.144
       trunk |       13.9         4.3      -0.044       2.159
      weight |     3032.0       792.9       0.118       2.073
      length |      188.3        22.7      -0.076       2.000
        turn |       39.8         4.4       0.071       2.228
displacement |      198.0        93.1       0.581       2.354
  gear_ratio |        3.0         0.5       0.279       2.109
     foreign |        0.3         0.5       0.850       1.723
-------------------------------------------------------------

-moments- is also just smart enough to filter out any string variables
fed to it, rather than choking on them (-tabstat-) or giving a line
of output flagging 0 observations (-summarize-).

There are some other features too, but that's enough on -moments-.

Nick
n.j.cox@durham.ac.uk

```
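
The default casewise deletion described in the post can probably be approximated with official commands alone, which may help when checking -moments- output by hand. The following is only a sketch under stated assumptions: it assumes the examples use the auto dataset shipped with Stata (the variable names and labels suggest so), that -sysuse- is available, and that -egen- offers the rowmiss() function; the variable name nmiss is made up for illustration.

```stata
* Sketch only: build the common (complete-case) sample by hand.
sysuse auto, clear

* Count missing values per observation across the numeric variables.
* (make is a string variable, so it is left out of the list.)
egen byte nmiss = rowmiss(price-foreign)

* Restricting to nmiss == 0 mimics casewise deletion; the format
* matches the %9.3f default mentioned in the post.
tabstat price-foreign if nmiss == 0, c(s) s(n mean sd skewness kurtosis) format(%9.3f)

* -summarize, detail- leaves the same moments behind as r() results.
quietly summarize price if nmiss == 0, detail
display r(N) "  " %9.3f r(mean) "  " %9.3f r(sd) "  " %9.3f r(skewness) "  " %9.3f r(kurtosis)
```

If the assumptions hold, the figures should line up with the n = 69 table in the post, although N would still be shown with three decimal places by -tabstat-, which is the irritation the post describes.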
