Statalist



Re: st: AW: Create a flag variable for 10 most frequent values


From   Sergiy Radyakin <[email protected]>
To   [email protected]
Subject   Re: st: AW: Create a flag variable for 10 most frequent values
Date   Mon, 16 Nov 2009 18:03:16 -0500

Suppose you have data with two variables, name and diagnosis (or make and mpg),
and you want to add a "top10" dummy to that.
You keep one person for each diagnosis.
After you -expand-, won't there be N persons with the same name?
Can you show this with auto.dta?
S.R.
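
For illustration, here is a minimal sketch of the point at issue using auto.dta, with make standing in for name and mpg for diagnosis. It is not code posted in this thread. -expand- makes copies of the one observation kept per group, so every variable other than the grouping variable is duplicated from that row. The second part is one possible alternative that never drops observations. The names mycount, tag, rank and top10 are hypothetical, and breaking ties in frequency by the smaller mpg value is an assumption, not something the thread specifies. Run it from a do-file, since // comments are not accepted at the interactive command line.

*************
// Part 1: -keep- one row per mpg, then -expand-
sysuse auto, clear
keep make mpg
bysort mpg: gen long mycount = _N
bysort mpg: keep if _n == 1
expand mycount
// the row count is restored, but make is repeated within each mpg value
duplicates report make

// Part 2: build the flag without dropping any observation
sysuse auto, clear
keep make mpg
// frequency of each mpg value, stored on every row
bysort mpg: gen long mycount = _N
// mark exactly one row per distinct mpg value
egen byte tag = tag(mpg)
// tagged rows first, most frequent first, ties broken by mpg (assumption)
gsort -tag -mycount mpg
gen long rank = _n if tag
// copy each group's rank to all of its rows (missing sorts last)
bysort mpg (rank): replace rank = rank[1]
gen byte top10 = rank <= 10

// make should still identify the observations
isid make
tabulate mpg top10
*************

The same steps should work for a string variable such as dx, because -bysort- and -egen, tag()- accept string variables.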




On Mon, Nov 16, 2009 at 5:36 PM, Martin Weiss <[email protected]> wrote:
>
> <>
>
> What do you want to know? I collapse the data (fine print: no hyphens around
> "collapse", as I use -keep- to do it) so that I can -sort- on "mycount" and
> assign the flag that Elan requested. Once that is done, I want my original
> data back, so I -expand- it back to its former glory. Any suggestions for
> improvements are welcome...
>
>
>
> HTH
> Martin
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Sergiy Radyakin
> Sent: Monday, 16 November 2009 23:33
> To: [email protected]
> Subject: Re: st: AW: Create a flag variable for 10 most frequent values
>
> Martin, could you please explain how -expand- is used here?
> Best, Sergiy
>
> On Mon, Nov 16, 2009 at 5:14 PM, Martin Weiss <[email protected]> wrote:
>>
>> <>
>>
>> Here is a strategy:
>>
>>
>> *************
>> clear*
>>
>> // construct example data
>> set obs 10000
>> gen dx=1+int(100*runiform())
>>
>> // see frequencies
>> ta dx
>> // use Ben Jann's -fre- (installed from SSC if missing)
>> capture which fre
>> if _rc ssc install fre
>> fre dx, desc
>>
>> // put each group's count next to its "dx"
>> bys dx: egen mycount=count(dx)
>>
>> // collapse to one observation per group
>> bys dx: keep if _n==1
>> // -sort- on the count variable (ties are ordered arbitrarily)
>> sort mycount
>> // flag the last ten, i.e. the most frequent
>> gen byte mostfreq=inrange(_n,`=_N-9',_N)
>> // and back as we were: one row per original observation
>> expand mycount
>>
>> // see result
>> ta mycount mostfreq
>> *************
>>
>>
>>
>> HTH
>> Martin
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Cohen, Elan
>> Sent: Monday, 16 November 2009 22:25
>> To: '[email protected]'
>> Subject: st: Create a flag variable for 10 most frequent values
>>
>> Hi all,
>>
>> I have a string variable dx that represents a patient's diagnosis (about
>> 5,000 unique values).  I'd like to create a "top 10 flag" that equals 1 if
>> dx is one of the top 10 most frequent diagnoses and 0 otherwise.
>>
>> I'm not even sure where to begin.  If someone could point me in the right
>> direction, I'd be grateful.  Stata 10, Windows XP
>>
>> Thank you,
>>
>> - Elan
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>



