"Nick Cox" <n.j.cox@durham.ac.uk>

<statalist@hsphsun2.harvard.edu>

st: RE: Sorting/ranking Q from new user

Fri, 1 Aug 2003 10:13:07 +0100

Eric VonDohlen

> I have a continuous variable X, which I would like to:
>
> (a) sort in ascending or descending order;
> (b) rank the sorted X into some specified number of groups;
> (c) report the mean of X (or some other statistic) by group.

Jayesh Kumar replied and pointed to -gsort- for (a). Fine.

On (b) and (c) Jayesh suggested

> *This will create percentiles, you can choose your own number of groups.
> *for ranking purpose:
> by year:gen a=_n
> bysort year: egen b=max(a)
> gen percentile_year=((a/b)*100)
> *for reporting summary statistics:
> bysort percentile_year: summarize year

This is an interesting approach, but it needs some fixes and a couple of warnings. I don't think it is general enough to be the best answer to Eric's question.

A small fix is that the first command depends on observations being in the right -sort- order, so -bysort- is needed on the first command (and not needed on the second):

    bysort year: gen a = _n
    by year: egen b = max(a)
    gen percentile_year = ((a/b)*100)

As a matter of Stata style only, this can be condensed to

    bysort year : gen percentile_year = (_n/_N) * 100

The first major problem is that whatever is of interest should be sorted within each -year- (if not, the assignment of percentiles is quite arbitrary).

    bysort year (whatever) : gen percentile_year = (_n/_N) * 100

Two other major problems:

* No adjustment for ties. Tied values will get assigned to different percentiles.

* This works best when there is an equal number of observations within each group, but not otherwise. (Suppose there were 4 observations in each -year-. -percentile_year- would take on values 25, 50, 75, 100.)

A more general answer to Eric's question is to use -xtile- and then -summarize-, -tabstat-, etc. (and to read the manual; new users are expected to read the manual like everybody else!).
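As a rough sketch of the -xtile- route, and not the only way to do it: the example below assumes Stata's bundled auto dataset, with -mpg- standing in for Eric's X. The choice of 5 groups and the variable name -mpg_group- are made up for illustration.

```stata
* Illustration only: auto.dta and mpg stand in for Eric's data and X
sysuse auto, clear

* (b) assign each observation to one of 5 quantile-based groups
* (mpg_group is a hypothetical name; 5 is an arbitrary choice)
xtile mpg_group = mpg, nquantiles(5)

* (c) report statistics of the original variable by group
tabstat mpg, by(mpg_group) statistics(n mean min max)
```

Because -xtile- works from quantile cutpoints, tied values land in the same group, so the groups need not be exactly equal in size.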
Nick
n.j.cox@durham.ac.uk

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

