Statalist



AW: st: AW: Create a flag variable for 10 most frequent values


From   "Martin Weiss" <martin.weiss1@gmx.de>
To   <statalist@hsphsun2.harvard.edu>
Subject   AW: st: AW: Create a flag variable for 10 most frequent values
Date   Mon, 16 Nov 2009 23:36:09 +0100

<> 

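What do you want to know? I collapse the data to one observation per value
of "dx" (fine print: no hyphens around "collapse", as I use -keep- rather
than the -collapse- command) so that I can -sort- on "mycount" and assign
the flag that Elan requested. Once that is done, I want my original data
back, so I -expand- by "mycount": each remaining observation is replaced by
as many copies of itself as its value of "dx" occurred originally. Any
suggestions for improvements are welcome...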
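
To make the -expand- step concrete, here is a minimal sketch on toy data. The two-row dataset is invented for illustration only; "dx" and "mycount" stand in for the variables in the code quoted below.

```stata
* Toy data (hypothetical): one row per value of dx, with its original frequency
clear
input str1 dx byte mycount
"a" 3
"b" 1
end

* -expand- replaces each observation with mycount copies of itself;
* a value of 1 leaves the observation as a single row
expand mycount

* the new copies are appended at the end, so sort before listing
sort dx
list, sepby(dx)
```

After this, "a" should occupy three rows and "b" one, which is the row count the data had before collapsing.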


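One possible variation, offered as a sketch under assumptions and not as part of the code quoted below: -expand- only rebuilds the data faithfully when "dx" is the sole variable of interest, because -keep- discards whatever else differed within a group. The version here never drops observations. The variable names "id" and "first" are hypothetical, and the simulated data merely mirror the example. It counts with _N, which should also work when "dx" is a string variable, as in Elan's data.

```stata
* Simulated data as in the example, plus a hypothetical extra variable
clear
set seed 12345
set obs 10000
gen dx = 1 + int(100*runiform())
gen long id = _n

* frequency of each value of dx, stored on every observation
bysort dx: gen long mycount = _N

* mark exactly one observation per distinct value of dx
egen byte first = tag(dx)

* marked observations first, most frequent first; dx breaks ties reproducibly
gsort -first -mycount dx

* flag the first ten marked observations...
gen byte mostfreq = first & (_n <= 10)

* ...and spread the flag to every observation sharing that value of dx
bysort dx (mostfreq): replace mostfreq = mostfreq[_N]

* restore the original order and inspect
sort id
drop first
ta mycount mostfreq
```

Ties at the tenth place are broken here by the value of "dx" itself; a plain -sort- on "mycount" alone leaves the order of tied groups arbitrary, so which of them is flagged can change from run to run.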

HTH
Martin


-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu
[mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Sergiy Radyakin
Sent: Monday, 16 November 2009 23:33
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Create a flag variable for 10 most frequent values

Martin, could you please explain how -expand- is used here?
Best, Sergiy

On Mon, Nov 16, 2009 at 5:14 PM, Martin Weiss <martin.weiss1@gmx.de> wrote:
>
> <>
>
> Here is a strategy:
>
>
> *************
> clear*
>
> //construct data
> set obs 10000
> gen dx=1+int(100*runiform())
>
> //see freqs
> ta dx
> //use Ben Jann's -fre- (installed from SSC if not already present)
> capture which fre
> if _rc ssc install fre
> fre dx, desc
>
> //get counts next to "dx"s
> bys dx: egen mycount=count(dx)
>
> //collapse to one per group
> bys dx: keep if _n==1
> //-sort- on count var
> sort mycount
> //take the last ten
> gen byte mostfreq=inrange(_n,`=_N-9',_N)
> //and back as we were
> expand mycount
>
> //see result
> ta mycount mostfreq
> *************
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On behalf of Cohen, Elan
> Sent: Monday, 16 November 2009 22:25
> To: 'statalist@hsphsun2.harvard.edu'
> Subject: st: Create a flag variable for 10 most frequent values
>
> Hi all,
>
> I have a string variable dx that represents a patient's diagnosis (about
> 5,000 unique values).  I'd like to create a "top 10 flag" that equals 1 if
> dx is one of the top 10 most frequent diagnoses and 0 otherwise.
>
> I'm not even sure where to begin.  If someone could point me in the right
> direction, I'd be grateful.  Stata 10, Windows XP
>
> Thank you,
>
> - Elan
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



© Copyright 1996–2014 StataCorp LP