Re: st: Zeros and measures of inequality or concentration


From   Troy Payne <paynetc@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Zeros and measures of inequality or concentration
Date   Thu, 9 Feb 2012 08:00:25 -0900

Thanks to Nick Cox and David Hoaglin for the suggestion to use Poisson
or zero-inflated models.  I've used those in the past when modeling
the effect of independent variables on crime.  Here, my purpose is
more descriptive; I have no predictors to model.

Thanks also to Stephen Jenkins and Roger Newson for suggestions to use
-ineqdec0- and -scsomersd- packages.  I'll do that and read their
documentation.

The help is much appreciated.

--
Troy Payne
Email: paynetc@gmail.com




On Wed, Feb 8, 2012 at 8:17 PM, Troy Payne <paynetc@gmail.com> wrote:
> I have a more statistical question than a Stata-related question:  Which measure of inequality or concentration is best for data with a large number of observations with a value of zero?
>
> While I haven't used them before, it seems that Lorenz curves, Gini coefficients, and other related measures of inequality would be a good way to examine concentrations of crime at addresses.  Like income, crime tends to be highly concentrated, with a relative handful of places contributing large proportions to the total crime count.  In fact, at the place-level (address or street segment) the most common crime count is often zero.
>
> I have crime data at apartment buildings in a midwestern city.  In my data, 45% of apartments had zero crimes in any given year.  If I include only violent crimes, then 74% of apartments have zero crimes in any given year.
>
> Posts here on Statalist led me to -inequal-, -sgini-, -lorenz-, and -glcurve- (all installed in Stata 12.1, all available via SSC).  Judging from the r(N) returned, -inequal- seems to explicitly exclude observations with values of zero, while -sgini- does not.  It's difficult for me to tell whether -lorenz- and -glcurve- include observations with values of zero, even after reading the help files and other documentation provided.
>
> Nearly all of what I've read about these various inequality measures so far seems to assume non-zero values, or at least that zero values are rare.  I'm unsure what practical impact a large proportion of zeros would have, even for user-written commands that appear to allow them.
>
> Until two days ago, I had never dug into the details of Gini coefficients.  It's possible that the documentation has the answer and I've just missed it.  I'd very much appreciate any guidance list members could give.
>
> --
> Troy Payne
> paynetc@gmail.com
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
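
The question in the quoted message, what a large share of zeros does to a Gini coefficient, can be illustrated with a small calculation. The sketch below is in Python rather than Stata and is not the method used by -inequal-, -sgini-, or any other package named in the thread. The crime counts are made up for illustration, and the `gini` function is a hypothetical helper using the standard rank-based formula for unweighted, non-negative data.

```python
def gini(values):
    """Gini coefficient for non-negative values, using the rank-based formula
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted ascending
    and ranks i = 1..n."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one observation and a positive total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical crime counts for 10 buildings; 6 of them have zero crimes.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6]

share_zero = sum(1 for c in counts if c == 0) / len(counts)
g_all = gini(counts)                               # zeros included
g_pos = gini([c for c in counts if c > 0])         # zeros dropped

# With zeros included, the Gini equals p0 + (1 - p0) * G_pos,
# where p0 is the share of zero observations.
g_from_identity = share_zero + (1 - share_zero) * g_pos

print(round(g_all, 4), round(g_pos, 4), round(g_from_identity, 4))
```

In this made-up example, dropping the zeros changes the coefficient substantially, so a command that silently excludes them answers a different question (concentration among buildings with any crime) than one that keeps them (concentration across all buildings).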
