Statalist The Stata Listserver


Re: st: Ordering a dataset by frequency


From:    wgould@stata.com (William Gould, Stata)
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: Ordering a dataset by frequency
Date:    Mon, 05 Mar 2007 11:54:35 -0600

jmm50@duke.edu wrote, 

> I have a dataset of 50,000 names.  I need to order them by frequency so 
> that the names that recur the most are at the top of the list, decreasing 
> by frequency.  [...]

If I just wanted to list the names and frequencies, in effect making a
one-way tabulation, I would type 

        . use dataset 
        . keep name 
        . sort name 
        . by name: gen freq = _N
        . by name: keep if _n==1
        . gsort -freq name
        . list name freq
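
A shorter sketch of the same tabulation, assuming the official -contract-
command is available in your Stata (the variable name freq is chosen here
only to match the code above).  -contract- replaces the data in memory with
one observation per distinct name and stores the count in the variable named
in freq():

        . use dataset
        . contract name, freq(freq)
        . gsort -freq name
        . list name freq

Because -contract- discards the original observations, this suits a listing
only, not saving the frequencies alongside the full data.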

If I wanted to save the frequencies with the original data:

        . use dataset 
        . sort name
        . by name: gen freq=_N
        . save, replace

If I wanted to list the entire dataset with the most frequent names on top,
then, with freq created as above, I would type

        . gen negfreq = -freq
        . sort negfreq name
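
The two commands above should be equivalent to a single -gsort-, which sorts
in descending order on any variable prefixed with a minus sign.  A minimal
sketch, assuming dataset already contains freq as saved in the previous step:

        . use dataset
        . gsort -freq name
        . list

This avoids creating the helper variable negfreq.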

-- Bill
wgould@stata.com