Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Quintiles

From: Nick Cox
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Quintiles
Date: Thu, 9 Aug 2012 11:38:35 +0100

```If I read this correctly, Leonardo agrees that exactly equal
frequencies may be impossible with -xtile- but wants to appear to do
it exactly by subterfuge, using weights.

This can be done:

. sysuse auto
. xtile qmpg = mpg, n(5)
. tab qmpg

5 quantiles |
     of mpg |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |         18       24.32       24.32
          2 |         17       22.97       47.30
          3 |         13       17.57       64.86
          4 |         12       16.22       81.08
          5 |         14       18.92      100.00
------------+-----------------------------------
      Total |         74      100.00

. bysort qmpg : gen w = 1/_N

. tabstat w , by(qmpg)  s(n sum)

Summary for variables: w
by categories of: qmpg (5 quantiles of mpg)

    qmpg |         N       sum
---------+--------------------
       1 |        18         1
       2 |        17         1
       3 |        13         1
       4 |        12         1
       5 |        14         1
---------+--------------------
   Total |        74         5
------------------------------

However, why is exact equality such a big deal here? Why coarsen when
you have quantitative information to hand?

See http://www.stata.com/statalist/archive/2012-06/msg01193.html on how
-xtile- on a negated version of a variable may (or may not) work
better.

Nick

On Thu, Aug 9, 2012 at 9:16 AM, Maarten Buis <maartenlbuis@gmail.com> wrote:
> On Wed, Aug 8, 2012 at 9:44 PM, Leonardo Jaime Gonzalez Allende wrote:
>> I was not planning to cut a person or household into many parts. The question was about a possible adjustment to the weight factor when a sample observation falls at the cut point of a quintile.
>>
>> If I sort the households of a sample by their incomes, a household "x" could represent 300 households, but the cumulative frequency of the population at that point is, e.g., 20.02%.
>>
>> My question was whether there is an efficient way (command) to repeat the observation and adjust the weight factor as follows:
>>
>> the same household "xa" now represents 280 households, and the cumulative frequency of the population is now, e.g., exactly 20% (leaving it in the first quintile).
>
> What kind of weight did you have in mind: aweights, pweights,
> iweights, fweights? Weighting can be a remarkably tricky issue. There
> are many ways such a procedure could go wrong, and I don't know if
> there is a way to get it right. Anyhow, I cannot imagine a situation
> where such an effort would be worth the cost (but that may just as
> well say something about a lack of imagination on my part). I would
> just live with the fact that the discrete nature of the number of
> observations leads to slight variations in group size.
>
> Did you look at the possibility that ties (different people reporting
> exactly the same income) are the source of differences in group size?
> In theory, such ties should be pretty rare for a (semi-)continuous
> variable like income. However, in practice respondents tend to round
> their answers, making such ties a lot more common.
```
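The points raised in the thread can be tried out on the same auto data. What follows is only a sketch, not anything posted by the participants. The variable names `ties`, `negmpg` and `qneg`, the locals `cut1` to `cut4`, and the choice of `iweight` for the tabulation are assumptions made for illustration. It rebuilds the weights from the post, looks at how common ties in `mpg` are, counts the observations sitting exactly on each quintile cut point, and compares `-xtile-` on `mpg` with `-xtile-` on its negation.

```stata
* Sketch only: ties, negmpg, qneg and cut1-cut4 are illustrative names.
sysuse auto, clear

* Quintile groups and within-group weights, as in the post
xtile qmpg = mpg, nquantiles(5)
bysort qmpg : generate w = 1/_N

* Weighted tabulation: each group's weighted frequency should be about 1
tabulate qmpg [iweight = w]

* How many observations share each mpg value?
* Many ties limit how evenly any cut points can split the data.
bysort mpg : generate ties = _N
tabulate ties

* Save the four quintile cut points first, because -count- overwrites r()
_pctile mpg, nquantiles(5)
forvalues i = 1/4 {
    local cut`i' = r(r`i')
}

* Count observations lying exactly on each cut point
forvalues i = 1/4 {
    quietly count if mpg == `cut`i''
    display "cut point `i' = `cut`i'': " r(N) " observation(s) tied there"
}

* -xtile- on the negated variable, with groups relabelled so 1 is still lowest mpg
generate negmpg = -mpg
xtile qneg = negmpg, nquantiles(5)
replace qneg = 6 - qneg

* Cross-tabulate the two groupings to see where they differ
tabulate qmpg qneg
```

If the two groupings disagree in the final cross-tabulation, the disagreement should be confined to observations tied at or near a cut point, which is the situation Maarten describes.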