Statalist



Re: st: RE: Question about cumulative density (cumul, xtile) -- quintiles and poverty status are not in sync. What am I doing wrong?


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: RE: Question about cumulative density (cumul, xtile) -- quintiles and poverty status are not in sync. What am I doing wrong?
Date   Fri, 3 Aug 2007 14:06:04 -0400

Anna Gueorguieva--
I doubt that switching between [aw] and [pw] affects your results,
since you are not calculating SEs, and point estimates are the same
under either weight type.  I suspect the problem is that -cumul- is
run on real_totc while poor and -xtile- use real_totc_per_ae, and the
two variables do not rank households identically. Try instead:

bys wave: cumul real_totc_per_ae [aw=postpweight], gen(cdist)
gen quint = ceil(5 * cdist)
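
To see what the weighted cumulative is doing, here is a rough sketch of
the same calculation done by hand with -sum()-. The names wsum,
cdist_manual and quint_manual are made up for illustration, and tied
values may be treated slightly differently than by -cumul-, so take it
as a cross-check under those assumptions rather than a replacement.

```stata
* Sketch only: weighted cumulative distribution within wave, by hand.
* wsum, cdist_manual, quint_manual are illustrative names (not from the thread).
sort wave real_totc_per_ae

* Running sum of weights within wave; observations with missing
* consumption contribute zero (sum() also treats missing as zero).
by wave: gen double wsum = sum(postpweight * !missing(real_totc_per_ae))

* Divide by the wave total to get a cumulative share in (0, 1].
by wave: gen double cdist_manual = wsum / wsum[_N] ///
    if !missing(real_totc_per_ae) & !missing(postpweight)

* Missing values stay missing because ceil(.) is missing.
gen quint_manual = ceil(5 * cdist_manual)
```

If quint_manual and quint agree apart from ties, the quintiles are built
on the same variable as the poverty indicator.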

You might also just report the percent poor by wave, so you know where
in the distribution povline_m falls (and therefore how the tab of
quintile vs poor should look).
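
A minimal sketch of that check, assuming the variables poor, wave,
quint and postpweight exist as in the original post. If the weighted
poverty rate in a wave is above 40 percent, then some poor households
must fall in quintile 3, and the cross-tab is not contradictory.

```stata
* Weighted percent poor in each wave (row percentages, no raw counts).
tab wave poor [aw=postpweight], row nofreq

* Quintile by poverty status, one table per wave.
bys wave: tab quint poor [aw=postpweight], col nofreq
```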

On 8/3/07, Nick Cox <n.j.cox@durham.ac.uk> wrote:
> I have thoughts on various levels. Others
> can advise better on survey statistics, the quantum
> mechanics of statistics, in which thinking you
> understand probably means you are not confused
> at a high enough level.
>
> 1. I see that you are using -cumul- with -aweight- and
> -xtile- with -pweight- and I would have thought it quite
> likely that that would be a source of difficulty.
> I know that -cumul- doesn't support -pweight-,
> but ignoring that problem won't be a solution.
>
> -cumul- itself is not a big deal; so as long
> as you can define the cumulative for a set
> of values with associated pweights, its
> computation should be easy, using -sum()-
> to cumulate.
>
> 2. You reported problems with -xtile2-.
> Please say where user-written programs you
> are using come from. -xtile2- is a user-written
> program from SSC. It gives you the error
> you report, I surmise, because it lacks
> a -version- statement and so is broken
> by a later change in Stata's syntax. This
> issue was discussed at length in a recent
> thread on -outreg2-. I think you need
> to insert
>
>        version 7
>
> after the -program- statement.
>
> 3. I think that the term "cumulative
> distribution function" is more nearly
> standard. I know that a density function
> is integrated to get a distribution
> function, but the result is no longer
> a density. Anyway, literatures may differ,
> but my impression is that cumulative
> density is not a standard term.
>
> 4. Code like
>
> gen quint=1 if cumdens<=.2
> replace quint=2 if cumdens>.2&cumdens<=.4
> replace quint=3 if cumdens>.4&cumdens<=.6
> replace quint=4 if cumdens>.6&cumdens<=.8
> replace quint=5 if cumdens>.8&cumdens~=.
>
> could be condensed to this
>
> gen quint = ceil(5 * cumdens)
>
> Here -ceil()-, short for "ceiling",
> rounds up to the nearest integer. Also
> there is no need to trap missings,
> because -ceil()- of missing is missing.
>
> -ceil()- is discussed in
>
> SJ-3-4  dm0002  . . . . . . . . Stata tip 2: Building with floors and ceilings
>        Q4/03   SJ 3(4):446--447                                 (no commands)
>        tips for using floor() and ceil()
>
> and again in "33 Stata Tips" available
> from StataCorp in paperback.
>
> Nick
> n.j.cox@durham.ac.uk
>
> Anna Gueorguieva
>
> > My quintiles and poverty status do not seem to be in sync and
> > I have no idea what is going on!!
> >
> > I generate my indicator for poor as having consumption per
> > adult equivalent below the poverty line.
> > gen poor=real_totc_per_ae<=povline_m
> >
> > Then I generate my quintiles by creating a cumulative
> > distribution of consumption (in each survey wave)
> > I use xtile and then double check it with the cumulative density.
> >
> > foreach x of numlist 1/4{
> > xtile quint`x'= real_totc_per_ae [pw=postpweight] if wave==`x',nq(5)
> > }
> > gen quint=quint1
> > foreach x of numlist 1/4{
> > replace quint=quint`x' if wave==`x'
> > }
> >
> > bys wave: cumul real_totc [aweight=postpweight], gen(cumdens)
> >
> > *check quints
> > replace quint=1 if cumdens<=.2
> > replace quint=2 if cumdens>.2&cumdens<=.4
> > replace quint=3 if cumdens>.4&cumdens<=.6
> > replace quint=4 if cumdens>.6&cumdens<=.8
> > replace quint=5 if cumdens>.8&cumdens~=.
> > And the problem is below -- it is not possible that there are
> > poor people in quintile 3!!!! :
> >   tab quint poor [aw=postpweight] if wave==2, col nof
> >    Quintile |
> >          of |
> >    national |
> > distributi |         poor
> >          on |         0          1 |     Total
> > -----------+----------------------+----------
> >           1 |      0.00      45.96 |     19.98
> >           2 |      0.40      45.44 |     19.98
> >           3 |     28.81       8.61 |     20.03
> >           4 |     35.19       0.00 |     19.89
> >           5 |     35.60       0.00 |     20.12
> > -----------+----------------------+----------
> >       Total |    100.00     100.00 |    100.00
> >
> > And xtile2 gives me an error:
> >   xtile2 newquint=real_totc_per_ae [pw=postpweight] ,nq(5) by(wave)
> > program error:  matching close brace not found
> > r(198);
>