Re: Re: Re: st: Tabulate summary statistics by percentiles and save output


From   annoporci <annoporci@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: Re: Re: st: Tabulate summary statistics by percentiles and save output
Date   Sun, 30 Dec 2012 05:35:21 +0800

I wish to tabulate summary statistics for observations lying between
selected percentiles, and to export the tables to files in tex format.

It turns out that my tabulations had serious problems, caused by a
misunderstanding of the Stata syntax, which Nick kindly pointed out. For
the record, I copy below the code which, I think, achieves the first of my
objectives, namely summary statistics for different percentile ranges. I am
still working on exporting the results to LaTeX tables.


   clear
   cap log close
   set more off
   cd c:\stata\

   use ibm,clear
   tsset date
   local variables ibm // spx

   preserve

   /* Tabulate moments for different percentiles */
   // summarize, detail returns only a few selected percentiles
   // (p1 p5 p10 p25 p50 p75 p90 p95 p99), so r() is refreshed from the
   // full sample before each subset call
   foreach var of varlist `variables' {
     quietly summarize `var', detail
     summarize `var' if inrange(`var',`=r(p1)',`=r(p10)'), detail
     quietly summarize `var', detail
     // summarize returns r(max), not r(p100)
     summarize `var' if inrange(`var',`=r(p90)',`=r(max)'), detail
   }

   /* Tabulate moments for different percentiles */
   // Here r() is not refreshed: the second subset call uses the results
   // left behind by the first subset call, not those of the full sample
   // NOTE: most likely not what is intended
   foreach var of varlist `variables' {
     quietly summarize `var', detail
     summarize `var' if inrange(`var',`=r(p1)',`=r(p10)'), detail
     // summarize returns r(max), not r(p100)
     summarize `var' if inrange(`var',`=r(p90)',`=r(max)'), detail
   }

   restore

   /* Tabulate moments for different percentiles */
   // Alternative approach using centile and tabstat,
   // for percentiles beyond those returned by summarize

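   /* Compute and store percentiles */
   // r(c_#) is indexed by position in the centile() list; with 1(1)100
   // the position coincides with the percentile
   foreach var of varlist `variables' {
     quietly centile `var', centile(1(1)100) normal
     forval i = 1(1)100 {
       scalar `var'_p`i' = r(c_`i')
     }
   }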


   /* Compute and store count and moments between p_i (i = 1..99) and p100 */
   foreach var of varlist `variables' {
     forvalues i = 1(1)99 {
       quietly tabstat `var' if inrange(`var',`=`var'_p`i'',`=`var'_p100') ///
                     , stat(count mean sd skewness kurtosis) save
       *return list
       tempname total
       matrix `total' = r(StatTotal)   // rows follow the order in stat()
       *matrix list `total'
       scalar `var'_ob_p`i'_p100 = `total'[1,1]   // observations
       scalar `var'_mu_p`i'_p100 = `total'[2,1]   // mean
       scalar `var'_sd_p`i'_p100 = `total'[3,1]   // standard deviation
       scalar `var'_sk_p`i'_p100 = `total'[4,1]   // skewness
       scalar `var'_kt_p`i'_p100 = `total'[5,1]   // kurtosis
     }
   }

   scalar list

Remark: I saved the statistics of interest as scalars; I am not sure that
is the smartest way.
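
One alternative to a long list of scalars is to collect the statistics in
a single matrix, one row per percentile range. The following is only a
sketch under assumptions: the auto data set shipped with Stata and its
price variable stand in for the ibm data, and the names stats, cut1-cut3
and top are made up for the illustration.

```stata
* Sketch only: auto.dta and price stand in for ibm.dta and ibm
sysuse auto, clear
local var price

* r(c_#) is indexed by position in the centile() list, not by percentile
quietly centile `var', centile(10 50 90)
forvalues k = 1/3 {
    scalar cut`k' = r(c_`k')
}
quietly summarize `var'
scalar top = r(max)

* One row per lower cut-off, one column per statistic
matrix stats = J(3, 5, .)
matrix rownames stats = p10_max p50_max p90_max
matrix colnames stats = N mean sd skewness kurtosis

forvalues k = 1/3 {
    quietly tabstat `var' ///
        if inrange(`var', scalar(cut`k'), scalar(top)), ///
        stat(count mean sd skewness kurtosis) save
    * r(StatTotal) is 5 x 1, so its transpose fills row k
    matrix stats[`k',1] = r(StatTotal)'
}

matrix list stats
```

Keeping everything in one matrix leaves a single object to pass to
whatever export step comes next. The scalar() pseudofunction is used so
that a scalar name is never mistaken for an abbreviated variable name.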

Obviously, in practice I do not intend to do so many computations for so
many percentiles; the above merely illustrates what is possible at my
current skill level. I use -centile- with -tabstat- because
-summarize, detail- returns only a fixed selection of percentiles, which
is not enough for my purpose.
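
To see the difference concretely, one can inspect what each command leaves
behind in r(). This is a small sketch, again using the auto data set and
its price variable as stand-ins; the percentiles 37 and 63 are arbitrary
choices for illustration.

```stata
* Sketch only: auto.dta and price stand in for ibm.dta and ibm
sysuse auto, clear

* summarize, detail: fixed set p1 p5 p10 p25 p50 p75 p90 p95 p99
quietly summarize price, detail
return list

* centile: any requested percentiles, returned by position
* as r(c_1), r(c_2), ...
quietly centile price, centile(37 63)
display "p37 = " r(c_1) "   p63 = " r(c_2)
```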

The code computes statistics for the ranges p1-p100, p2-p100, p3-p100,
etc., which is useful for my ultimate purpose. Other ranges based on the
same code are obviously possible, e.g. p49-p51, p48-p52, p47-p53, etc.
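
A symmetric-window variant might look like the sketch below. It is an
illustration under assumptions: the auto data set and price stand in for
the ibm data, the scalar names c1-c99 are made up, and the windows widen
in steps of 10 percentiles (p40-p60, p30-p70, ...) only because the
example data set is small.

```stata
* Sketch only: auto.dta and price stand in for ibm.dta and ibm
sysuse auto, clear
local var price

* Store percentiles 1..99; position in the list equals the percentile
quietly centile `var', centile(1(1)99)
forvalues i = 1/99 {
    scalar c`i' = r(c_`i')
}

* Windows centred on the median: p40-p60, p30-p70, p20-p80, p10-p90
forvalues k = 10(10)40 {
    local lo = 50 - `k'
    local hi = 50 + `k'
    quietly summarize `var' ///
        if inrange(`var', scalar(c`lo'), scalar(c`hi')), detail
    display "p`lo'-p`hi': N = " r(N) ///
        "  mean = " %9.3f r(mean) "  sd = " %9.3f r(sd)
}
```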

I am posting this as a follow-up, for the record only and with no
guarantee: I have essentially no experience or competence in Stata or
statistics. I hope this is not against Statalist etiquette.

--
Patrick Toche.