
How can I create a dataset (or matrix) of means (or other statistics) of variables from the current dataset?

Title   Saving stats (means, standard deviations, etc.) into a dataset or matrices
Author Ronna Cong, StataCorp

collapse replaces the data in memory with a dataset of means, medians, and other summary statistics, optionally computed within the groups defined by by().

Example using collapse:

 . use auto, clear
 (1978 Automobile Data)
     
 . by foreign: sum price mpg weight [aweight=rep78]
     
 _______________________________________________________________________________
 -> foreign = Domestic
 
     Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
 -------------+-----------------------------------------------------------------
        price |      48    145.0000    6162.517   3106.007       3291      15906
          mpg |      48    145.0000        19.8   5.205471         12         34
       weight |      48    145.0000    3347.862   740.8696       1800       4840
 
 _______________________________________________________________________________
 -> foreign = Foreign
 
     Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
 -------------+-----------------------------------------------------------------
        price |      21     90.0000    6133.778   2286.096       3748      11995
          mpg |      21     90.0000    25.45556   6.719655         17         41
       weight |      21     90.0000    2285.778   371.6942       1760       3170
 
     
 . collapse (mean) price_mean = price (median) mpg_med = mpg (sd) weight_sd = weight [aweight=rep78], by(foreign)
 
 . list

    +------------------------------------------+
    |  foreign   price_~n   mpg_med   weight~d |
    |------------------------------------------|
 1. | Domestic    6,162.5        19     740.87 |
 2. |  Foreign    6,133.8        25    371.694 |
    +------------------------------------------+
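
Because collapse replaces the data in memory, it is often convenient to wrap it in preserve and restore and save the result to a file. The following do-file fragment is a minimal sketch of that pattern, not part of the original example. It assumes the auto dataset is loaded with sysuse, and auto_stats is a hypothetical file name chosen for illustration.

```stata
* Sketch: save group statistics to a file, then return to the original data.
* "auto_stats" is a hypothetical file name; change it as needed.
sysuse auto, clear

preserve                          // keep a copy of the data in memory
collapse (mean)   price_mean = price  ///
         (median) mpg_med    = mpg    ///
         (sd)     weight_sd  = weight ///
         [aweight=rep78], by(foreign)
save auto_stats, replace          // dataset with one row per value of foreign
restore                           // bring back the full auto dataset
```

The /// line continuation works only in do-files; when typing interactively, enter the collapse command on one line.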

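matrix accum, with the means() and deviations options, can be used to obtain a matrix of means and a covariance matrix. With deviations, the result is a matrix of cross products of deviations from the means, so it must be divided by r(N)-1 to yield covariances.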

Example using matrix accum:

 . use auto, clear
 (1978 Automobile Data)
     
 . correlate price mpg weight [aweight=rep78], means cov
 (sum of wgt is   2.3500e+02)
 (obs=69)
     
     Variable |         Mean    Std. Dev.          Min          Max
 -------------+----------------------------------------------------
        price |     6,151.51     2,801.56        3,291       15,906
          mpg |     21.96596     6.402573           12           41
       weight |     2,941.11     811.2383        1,760        4,840
 
 
              |    price      mpg   weight
 -------------+---------------------------
        price |  7.8e+06
          mpg | -8225.94  40.9929
       weight |  1.2e+06 -4123.77   658108
 
 
 . mat accum Cov = price mpg weight [aweight=rep78], noc means(M) deviations
 (sum of wgt is   2.3500e+02)
 (obs=69)
 
 . mat list M
 
 M[1,3]
            price        mpg     weight
 _cons  6151.5106  21.965957  2941.1064
 
 . mat Cov = Cov/(r(N)-1)
     
 . mat list Cov
 
 symmetric Cov[3,3]
          price         mpg      weight
  price   7848734.6
    mpg  -8225.9352   40.992942
 weight   1222727.7  -4123.7697   658107.63
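
If the statistics are needed as a dataset rather than as matrices, the matrices can be converted afterward. The do-file fragment below is a sketch under assumptions and not part of the original answer: it repeats the steps above, then uses the corr() matrix function to derive correlations from the covariance matrix and svmat to store the covariance matrix as variables. The matrix name R is hypothetical.

```stata
* Sketch: build the means and covariance matrices, then turn Cov into a dataset.
sysuse auto, clear

matrix accum Cov = price mpg weight [aweight=rep78], ///
    noconstant means(M) deviations
matrix Cov = Cov/(r(N)-1)         // cross products of deviations -> covariances

matrix R = corr(Cov)              // R is a hypothetical name for the correlation matrix
matrix list R

clear                             // svmat adds variables, so start from empty data
svmat double Cov, names(col)      // creates variables price, mpg, weight
list
```

The resulting dataset has one observation per row of Cov. svmat does not carry the row names over, so add an identifier variable if the row labels matter.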