Statalist



Re: st: poverty/inequality analysis - follow-up


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: poverty/inequality analysis - follow-up
Date   Wed, 30 Jul 2008 15:53:48 -0400

Lola--
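1) x is a grid of points at which -kdensity- evaluates each density, so
that the densities can be compared at the same x values.  It only needs
to span the range of interest, not to have one value per observation:
"in 1/300" just gives 300 grid points, from 5.61 to 8.6.  See the at()
option in -help kdensity-.
2) No reason, really, though skewed distributions might need some more
work to get a sensible grid for -kdensity-, especially where the density
bunches up at the low end (the part you care about).
3) The "add" variable below falls linearly in rank from +.2 to -.2, so
it is zero at the median and has mean zero.  The transformation is
therefore both median- and mean-preserving, and that is what the
-summarize- command at the end of the code snippet shows.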
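
To check point 3 in isolation, here is a minimal sketch that rebuilds "add" on an empty dataset. It assumes 595 observations, which is what the 594 in the snippet implies; rank is simply _n here because there is nothing to sort.

```stata
clear
set obs 595
* same formula as in the snippet, stored as double to limit rounding
g double add = -((_n-1)*2/594-1)*.2
su add
* the mean is zero up to rounding error
assert abs(r(mean)) < 1e-10
* observation 298 of 595 is the median rank, where add is exactly zero
assert add[298] == 0
```

Since lw2 is lwage plus a variable with mean zero, the mean of lw2 should match the mean of lwage up to rounding.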

A more sensible transformation would not change people's ranks.  That
would require moving each person some proportion of the distance to
their nearest neighbor (working outward from the two observations
closest to the mean, say).  This is a lot more work, since you have to
pick a different factor for each observation, and it is especially hard
if the factors are engineered to hit some target poverty rate or SD.
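
For comparison, a much simpler rank-preserving squeeze is to pull every observation a fixed proportion of the way toward the mean. This is not the neighbor-by-neighbor scheme described above, just a minimal sketch of an alternative; the factor .8 and the name lw3 are arbitrary choices for illustration.

```stata
webuse psidextract, clear
keep if t==7
* store the sample mean of lwage in a local macro
su lwage, meanonly
local m = r(mean)
* shrink each deviation from the mean by 20% (.8 is an arbitrary factor)
g double lw3 = `m' + .8*(lwage - `m')
* same mean, SD scaled by .8
su lwage lw3
* ranks are unchanged, so the rank correlation is 1
spearman lwage lw3
```

Note that this preserves the mean of whatever variable it is applied to: applied to lwage it preserves mean log wage, not mean wage.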

If you have sample weights, you should use the weights instead of _n
to make "add", as in e.g.
http://www.stata.com/statalist/archive/2008-01/msg00330.html
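
A minimal sketch of one way to do this, which is not necessarily what the linked post does: replace the rank _n by the cumulated weight up to the midpoint of each observation's own weight, as a share of total weight. The variable wt is hypothetical. psidextract has no sample weights, so one is made up here purely for illustration, and the names cumw, p, and lw2w are likewise arbitrary.

```stata
webuse psidextract, clear
keep if t==7
set seed 12345
* hypothetical weights, for illustration only
g double wt = 1 + runiform()
sort lwage
* running sum of weights in order of lwage
g double cumw = sum(wt)
* weighted midpoint rank, strictly between 0 and 1
g double p = (cumw - wt/2)/cumw[_N]
* linear in weighted rank, from about +.2 at the bottom to -.2 at the top
g double add = -(2*p - 1)*.2
g double lw2w = lwage + add
su add lwage lw2w [aw=wt]
```

With this midpoint definition the weighted mean of p is exactly 1/2, so the weighted mean of add is zero. With equal weights, add reduces to the _n version multiplied by 594/595.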

I want to reiterate that I don't think you can really get sensible
results out of my "mean-preserving squeeze" operation, but I would be
happy to be proven wrong.

On Wed, Jul 30, 2008 at 7:41 AM, Lola Jackson <lola_jackson@ymail.com> wrote:
<snip>
> 1) What is the reason for generating the variable x, and especially for doing this for only about half of the observations (specified as "in 1/300" whereas there are almost 600 observations in the psidextract dataset once the command "keep if t==7" has been implemented)? I couldn't figure out why x is needed and is included in the subsequent kdensity and line commands. I was thinking of just dropping this from my modified version of the code, but was afraid that it might be very important and I would be making a big mistake in dropping this aspect.
>
> 2) Sorry if this is a stupid question, but is there any particular reason to work in logs rather than original values, for this type of work?
>
> 3) Perhaps my most important question: The code to generate lw2 actually seems to be MEDIAN-preserving and not mean-preserving (the relevant parts of the code for this are "g add=-((_n-1)*2/594-1)*.2; g lw2=add+lwage"). The "add" variable is 0 for the median person and hence lw2 is unchanged for that person, but raised for lower income earners and lowered for higher income earners, and the mean does actually change, unlike what is needed. I have tried to come up with a different method for preserving the mean (while reducing spread) but without success. What I need to do here is equivalent to a "stretch" as in the 3 S's of [(Stephen P. Jenkins and Philippe Van Kerm) "Accounting for income distribution trends: A density function decomposition approach" in Journal of Economic Inequality, 3(1), April 2005, 43-61] but in 'reverse' as I want to reduce not increase spread.
>
> (I believe that after doing this simple exercise I should be able to work 'backwards' using the -gidecomposition- command to decompose the resultant change in poverty into the growth, redistribution, and residual/interaction components; and with the new mean-preserving reduced-spread distribution the growth component should be 0 and the redistribution component should account for the entire reduction in poverty, similarly with the new distribution-preserving higher-mean distribution the growth component from -gidecomposition- should explain the entire resultant change in poverty.)
>
> Austin's simple suggested code (which he noted was just a silly example with important caveats):

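* keep one cross-section (the 594 below is _N-1 for this sample)
webuse psidextract, clear
keep if t==7
* grid of 300 points (5.61 to 8.6) at which to evaluate every density
g x=_n/100+5.6 in 1/300
kdensity lwage, at(x) g(d0) nogr
* shift everyone up by .2: higher mean, same shape
g lw1=lwage+.2
kdensity lw1, at(x) g(d1) nogr
* squeeze: add falls linearly in rank from +.2 to -.2
sort lwage
g add=-((_n-1)*2/594-1)*.2
g lw2=add+lwage
kdensity lw2, at(x) g(d2) nogr
* overlay the three densities, with a vertical reference line at 6.2
line d0 d1 d2 x, sort xli(6.2)
su lw*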

*not in logs:
g wage=exp(lwage)
g a2wage=-((_n-1)*2/594-1)*200
g w2=a2wage+wage
su w2 wage a2wage
tw kdensity wage || kdensity w2


