Statalist



st: RE: Fast way to calculate Gini coefficients


From   "Newson, Roger B" <r.newson@imperial.ac.uk>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Fast way to calculate Gini coefficients
Date   Fri, 15 May 2009 16:07:11 +0100

A program you haven't mentioned is -somersd-, which can also be used to calculate Gini coefficients, and can be downloaded from SSC. To do this in a Stata session, type

ssc desc somersd

for a brief description, and

ssc install somersd, replace

to install the package, and

net get somersd

to copy the 3 .pdf manuals for the -somersd- package to your current local folder. The manual -somersd.pdf- contains an example of the use of -somersd- for calculating a Gini coefficient. This example also appears in Newson (2006a).

I do not know whether -somersd- is faster or slower than the alternatives that you mention. However, it uses the algorithm of Newson (2006b), which calculates a confidence interval in a time asymptotically proportional to N log N, instead of in a time asymptotically proportional to the square of N, where N is the sample size.

If you are calculating Gini coefficients for a large number of subsets, then the -parmby- module of the -parmest- package might be useful. The -parmest- package can also be downloaded from SSC, using the -ssc- command.
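
For comparison, here is a minimal sketch of a by-group Gini point estimate that needs only one sort and built-in Stata commands. It is not the -somersd- or -parmby- method and gives no confidence intervals; it simply uses the standard sorted-data formula G = 2*sum(i*y_i)/(n*sum(y_i)) - (n+1)/n, where i is the rank of y_i within its group. The variable names -group- and -income- and the toy data are assumptions for illustration, and the sketch assumes unweighted, nonnegative incomes with a positive total in each group.

```stata
* Sketch only: Gini coefficient per subgroup from a single sort.
* Hypothetical variable names (group, income) and toy data.
clear
input byte group double income
1 1
1 2
1 3
1 4
2 5
2 5
2 5
end

* Exclude missing incomes so they do not receive a rank
drop if missing(income)

* Sort by income within group and record rank and group size
bysort group (income): gen double i = _n
by group: gen double n = _N

* Group totals of y and of rank*y
by group: egen double sum_y = total(income)
by group: egen double sum_iy = total(i * income)

* G = 2*sum(i*y_i)/(n*sum(y_i)) - (n+1)/n
gen double gini = 2 * sum_iy / (n * sum_y) - (n + 1) / n

* Keep one row per group
by group: keep if _n == 1
list group n gini
```

With real data, the -clear- and -input- lines would be dropped and the remaining commands run once per income variable. Ties do not affect the result, because tied incomes contribute the same total whichever order they are ranked in.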

I hope this helps.

Best wishes

Roger


References

Newson R. 2006a. Confidence intervals for rank statistics: Somers' D and extensions. The Stata Journal 6(3): 309-334.

Newson R. 2006b. Efficient calculation of jackknife confidence intervals for rank statistics. Journal of Statistical Software 15(1): 1-10.


Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: r.newson@imperial.ac.uk 
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/

Opinions expressed are those of the author, not of the institution.

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Philipp Rehm
Sent: 15 May 2009 14:29
To: statalist@hsphsun2.harvard.edu
Subject: st: Fast way to calculate Gini coefficients

Dear all,

Does anyone know the fastest way to compute Gini coefficients of several income variables over many subgroups (say, 2000) of a large dataset (with several million observations)? I am using Stata 10.1 on Windows XP.

There are many user-written programs for calculating Gini coefficients. I have been using the following:
(1) -egen_inequal- (by Michael Lokshin and Zurab Sajaia), followed by a -collapse- (although this can be sped up by using -keep-), and
(2) -fastgini- (by Zurab Sajaia), looping over groups and collecting the results in a matrix.

Both approaches take a long time (I haven't timed them against each other). Any ideas?

Many thanks,
Philipp



*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



