



Re: st: how to group variables into equal number groups


From   Maarten Buis <maartenlbuis@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: how to group variables into equal number groups
Date   Tue, 26 Mar 2013 16:07:57 +0100

On Tue, Mar 26, 2013 at 3:30 PM, Xixi Lin wrote:
>  I am trying to make independent variables into decile groups, and I
> used xtile decile=x1 if Period==`z', nq(10); however, it turns out
> that xtile does not make equal number of the 10 groups, is there any
> way to force stata to divide them into equal number of obs or almost
> equal number of obs?

This is a problem of logic rather than programming: you can only split
a set of observations into 10 groups of _exactly_ the same size if the
number of observations is a multiple of 10. If you have 10
observations, you can split them into 10 groups of 1 observation
each, but what do you do if you have eleven observations? Another
possible reason is ties in the variable x1. Say your variable x1
has 4 observations with values 1, 1, 2, 3 and you want to split it
into exactly 4 groups. The number of observations is not a problem, as
it is an exact multiple of 4, but how do you split the first two
observations into two groups?
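
A minimal sketch of the tie example above, using the four values from
the text. The variable name q4 is made up for illustration. Because
xtile assigns groups by value, the two observations with x1 equal to 1
cannot be separated, so at most three of the four groups can be
non-empty.

```stata
* Toy data from the example above: four observations, two of them tied
clear
input x1
1
1
2
3
end

* Ask for 4 groups; observations with equal values of x1 always
* land in the same group
xtile q4 = x1, nq(4)

* Show how many observations ended up in each group
tabulate q4
```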

Since you report large deviations, I suspect that the latter
problem is the one that really bites you. The solution is to look at
which values are so heavily tied and see if they make
sense. For example, in most western countries you would expect to see
a huge spike at 40 in a question on the usual number of hours per week
that a respondent works. Another example: if you have a
measure of occupational status, which people usually think of as
continuous, you will still see large spikes at values that correspond
to the status values assigned to very common occupations like nurses
and teachers. It is this type of deeper understanding of your data
that can guide you on what to do next, which will typically not be
splitting up your variable into 10 equally sized groups.
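
One possible way to find the heavily tied values is sketched below. It
assumes the variables x1 and Period from the question are in memory.
The name n_tied and the cutoff of five values per period are
arbitrary choices for illustration.

```stata
* Work on a temporary copy; restore brings back the original data
preserve

* Collapse to one row per (Period, x1) combination and store the
* number of observations sharing that value in n_tied
contract Period x1, freq(n_tied)

* Within each period, list the five most frequent values of x1
* (sorted ascending by n_tied, so the largest counts come last)
bysort Period (n_tied): list x1 n_tied if _n > _N - 5

restore
```

Whether the values that show up are meaningful spikes or artifacts is
something only knowledge of the data can tell you.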

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

