Statalist The Stata Listserver



st: -firstdigit- available from SSC


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: -firstdigit- available from SSC
Date   Wed, 23 May 2007 17:03:16 +0100

Thanks to Kit Baum, a new package -firstdigit- 
is now available from SSC. Stata 9 is required,
as the program depends on Mata. Use -ssc-
to install if interested. 
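
For example, from within Stata (the second line
assumes the help file is named after the command,
which is the usual SSC convention):

```stata
* download and install the package from SSC
ssc install firstdigit

* view the help (assumed to be named after the command)
help firstdigit
```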

-firstdigit- tabulates and analyses the first 
digits of numeric variables.  It also tests
Benford's law, under which the first digits 
d = 1,...,9 occur with probabilities 
log10(1 + 1/d). Thus given data of 12, 345, 
6789, etc., it would extract 1, 3, 6, etc., 
tabulate the frequencies of the digits 1 to 9 
and give a chi-square test of the law on 
8 degrees of freedom. 
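
As a quick check on those probabilities, the 
following one-liners (a sketch for interactive use, 
not part of the package) list log10(1 + 1/d) for 
d = 1,...,9 and their sum:

```stata
* Benford probabilities for first digits 1 to 9
mata: log10(1 :+ 1 :/ (1::9))

* their sum
mata: sum(log10(1 :+ 1 :/ (1::9)))
```

The probabilities fall from about 0.301 for digit 1 
to about 0.046 for digit 9. They sum to 1, up to 
rounding, because the product of (d + 1)/d over 
d = 1,...,9 telescopes to 10.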

Users of Stata 8 may wish to look at -benford-
by Nikos Askitas, also available from SSC
(and revised today). 

Alternatively, users of Stata 8 may use -chitest- 
from the package -tab_chi-, also available from 
SSC, for this purpose. The help details a Benford's
Law example. 

Mata users may be interested to see how the main
work goes in Mata: 

void fd_work(string scalar varname, 
             string scalar tousename, 
             string scalar percent) 
{ 
        real colvector y, obs, exp 
        real scalar n, i, chisq   
        string scalar name 

        // read the observations selected by the touse variable
        y = st_data(., varname, tousename)    
        n = rows(y) 
        exp = obs = J(9, 1, 0) 

        // first character of each value's string form, as a number
        y = strtoreal(substr(strofreal(y), 1, 1))

        for (i = 1; i <= 9; i++) {
                // observed and Benford expected frequencies
                obs[i] = colsum(y :== i) 
                exp[i] = n * log10(1 + 1/i)
                // return counts, or percents if percent was specified
                name = "r(obs" + strofreal(i) + ")"
                st_numscalar(name, 
                        percent == "" ? obs[i] : 100 * obs[i] / n) 
                name = "r(exp" + strofreal(i) + ")"
                st_numscalar(name, 
                        percent == "" ? exp[i] : 100 * log10(1 + 1/i)) 
        } 

        // Pearson chi-square on 9 - 1 = 8 degrees of freedom
        chisq = colsum(((obs - exp):^2) :/ exp) 
        st_numscalar("r(p)", chi2tail(8, chisq)) 
        st_numscalar("r(chisq)", chisq)
        st_numscalar("r(N)", n)
}       
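
The digit extraction step can be tried on its own. 
This one-liner (a sketch for interactive use, using 
the example values 12, 345 and 6789 from above) 
should show 1, 3 and 6: 

```stata
* first digits of 12, 345 and 6789
mata: strtoreal(substr(strofreal((12 \ 345 \ 6789)), 1, 1))
```

A value whose string form does not start with a 
digit from 1 to 9 (a negative number starts with 
"-", for example) matches none of the nine digits 
in the loop, yet is still counted in n. Whether 
such values are screened out before fd_work() is 
called is not shown in this excerpt.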

Nick 
n.j.cox@durham.ac.uk 
