Statalist


st: AW: Create a flag variable for 10 most frequent values


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: AW: Create a flag variable for 10 most frequent values
Date   Mon, 16 Nov 2009 22:50:58 +0100

Here is a strategy:


*************
clear*

//construct data
set obs 10000
gen dx=1+int(100*runiform())

//see freqs 
ta dx
//use Ben Jann's -fre- (installed from SSC if missing)
capture which fre
if _rc ssc install fre 
fre dx, desc

//get counts next to "dx"s
//(for a string dx, -bys dx: gen long mycount=_N- gives the group size directly)
bys dx: egen mycount=count(dx)

//collapse to one per group
bys dx: keep if _n==1
//-sort- on count var; adding dx makes the order among ties reproducible
sort mycount dx
//take the last ten (ties at the cutoff are split by dx order)
gen byte mostfreq=inrange(_n,`=_N-9',_N)
//and back as we were: one row per original observation
//(the original sort order is not restored)
expand mycount

//see result
ta mycount mostfreq
*************
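
If you would rather not drop and re-expand observations, the same idea can
be sketched in place. This is only a sketch, assuming -dx- is already in
memory (string or numeric); the names n_dx, tag, top and top10 are made up
for illustration, and ties at the cutoff are broken by the sort order of dx.

*************
//frequency of each value, attached to every observation
bys dx: gen long n_dx=_N
//mark one observation per distinct nonmissing value
egen byte tag=tag(dx)
//marked observations first, most frequent first; ties broken by dx
gsort -tag -n_dx dx
//flag the first ten marked observations
gen byte top=tag & _n<=10
//spread the flag to every observation with the same dx
bys dx (top): gen byte top10=top[_N]
drop tag top
*************

Missing values of dx are never tagged, so they end up with top10==0.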



HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Cohen, Elan
Sent: Monday, 16 November 2009 22:25
To: 'statalist@hsphsun2.harvard.edu'
Subject: st: Create a flag variable for 10 most frequent values

Hi all,

I have a string variable dx that represents a patient's diagnosis (about
5,000 unique values).  I'd like to create a "top 10 flag" that equals 1 if
dx is one of the top 10 most frequent diagnoses and 0 otherwise.  

I'm not even sure where to begin.  If someone could point me in the right
direction, I'd be grateful.  Stata 10, Windows XP

Thank you,

- Elan

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

