

From: Nick Cox <njcoxstata@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Zeros and measures of inequality or concentration
Date: Thu, 9 Feb 2012 09:25:55 +0000

I don't think there is any "best" here without a more detailed statement of your criteria and of what you want to do.

It so happens that most of the more conspicuous user-written Stata programs were written by economists with income data in mind. With incomes, zero values are, I gather, variously regarded as implausible, or a kind of missing, or not relevant to the question, or just difficult to handle given other analytical choices (e.g. the log of zero is undefined).

I do agree that zeros are clearly a major part of what you want to handle and that any program that excludes zeros is thereby excluded from consideration. That doesn't mean you have to adopt one of the others. (Use -viewsource- to look at the code if the documentation is unclear about zeros; or see what happens if you feed a variable that is all zeros to any program.)

In many ways wanting to reduce a distribution to even a few single measures is a primitive urge. In much of the best current research (e.g. in economics, again, or in ecology) calculating omnibus measures is usually combined with attempts to model the entire distribution (of income, species abundance, ...). Stata is rich in such modelling commands, but -poisson- remains the place to start, I would have thought, even if you decide quickly that it is not a good fit.

Nick

On Thu, Feb 9, 2012 at 5:17 AM, Troy Payne <paynetc@gmail.com> wrote:
> I have a more statistical question than a Stata-related question: Which measure of inequality or concentration is best for data with a large number of observations with a value of zero?
>
> While I haven't used them before, it seems that Lorenz curves, Gini coefficients, and other related measures of inequality would be a good way to examine concentrations of crime at addresses. Like income, crime tends to be highly concentrated, with a relative handful of places contributing large proportions of the total crime count. In fact, at the place level (address or street segment) the most common crime count is often zero.
>
> I have crime data at apartment buildings in a midwestern city. In my data, 45% of apartments had zero crimes in any given year. If I include only violent crimes, then 74% of apartments have zero crimes in any given year.
>
> Posts here on Statalist led me to -inequal-, -sgini-, -lorenz-, and -glcurve- (all installed in Stata 12.1, all available via SSC). Judging from the r(N) returned, -inequal- seems to explicitly exclude observations with values of zero, while -sgini- does not. It's difficult for me to tell if -lorenz- and -glcurve- include observations with values of zero, even after reading the help files and other documentation provided.
>
> Nearly all of what I've read about these various inequality measures so far seems to assume non-zero values, or at least that zero values are rare. I'm unsure what the practical impact of a large proportion of zeros would be, even for user-written commands that appear to allow them.
>
> Until two days ago, I had never dug into the details of Gini coefficients. It's possible that the documentation has the answer and I've just missed it. I'd very much appreciate any guidance list members could give.
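For readers wondering what a large share of zeros does to a Gini coefficient in practice, here is a minimal sketch in Python (not Stata, and not the code of any of the SSC programs named above). It assumes the plain population definition G = sum_i sum_j |x_i - x_j| / (2 n^2 mean) with no small-sample correction, and the crime counts are invented purely for illustration. Under that definition, if a fraction p of observations are zero, then G = p + (1 - p) * G_pos, where G_pos is the Gini among the positive values only, so the share of zeros is a lower bound on G.

```python
import math


def gini(values):
    """Population Gini coefficient, zeros included.

    Uses the sorted form of G = sum_i sum_j |x_i - x_j| / (2 n^2 mean):
    G = 2 * sum(i * x_(i)) / (n * total) - (n + 1) / n, with i = 1..n.
    Returns NaN when the total is zero (e.g. a variable that is all zeros),
    because the mean in the denominator is then zero.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return math.nan
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical yearly crime counts for ten buildings (made-up numbers).
counts = [0, 0, 0, 0, 0, 1, 1, 2, 3, 13]
positives = [c for c in counts if c > 0]
share_zero = 1 - len(positives) / len(counts)

g_all = gini(counts)      # zeros kept: about 0.76
g_pos = gini(positives)   # zeros dropped: about 0.52

print(f"share of zeros   = {share_zero:.2f}")
print(f"Gini, zeros kept = {g_all:.2f}")
print(f"Gini, positives  = {g_pos:.2f}")
print(f"p + (1-p)*G_pos  = {share_zero + (1 - share_zero) * g_pos:.2f}")
```

The gap between the two figures shows why it matters whether a given program silently drops zeros: the same data give noticeably different answers. The all-zeros case returning NaN is a choice made in this sketch; other implementations may report something else, which is one reason to test that case directly.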

**References**: **st: Zeros and measures of inequality or concentration**, *From:* Troy Payne <paynetc@gmail.com>
