
From: "Martin Weiss" <martin.weiss1@gmx.de>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: AW: Create a flag variable for 10 most frequent values
Date: Tue, 17 Nov 2009 00:37:50 +0100

<>

Good point! I always make up my own dataset according to the description in the initial post, and in this case, my dataset may have been too simple. Still, Elan can -merge- back with the original dataset, with "diagnosis" as her key.

***
sysuse auto, clear
keep mpg
bys mpg: egen mycount=count(mpg)
//collapse to one per group
bys mpg: keep if _n==1
//-sort- on count var
sort mycount
//take the last ten
gen byte mostfreq=inrange(_n,`=_N-9',_N)
//and back as we were
expand mycount
merge m:m mpg /*
*/ using "C:\Program Files (x86)\Stata11\auto.dta", /*
*/ nogenerate nolabel nonotes
***

You need to substitute the path to your auto dataset in the last line...

HTH
Martin

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Sergiy Radyakin
Sent: Tuesday, 17 November 2009 00:03
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: AW: Create a flag variable for 10 most frequent values

Suppose you have data with two vars, name and diagnosis (or make and mpg), and you want to add a "top10" dummy to that. You keep one person for each diagnosis. After you -expand-, will there be N persons with the same name? Can you show this with auto.dta?

S.R.

On Mon, Nov 16, 2009 at 5:36 PM, Martin Weiss <martin.weiss1@gmx.de> wrote:
>
> <>
>
> What do you want to know? I collapse (fine print: no hyphens around it as I
> use -keep- to do it) the thing to be able to -sort- on "mycount" and assign
> the flag that Elan requested. Once that is done, I want my original data
> back, so I -expand- it back to its former glory. Any suggestions for
> improvements are welcome...
>
> HTH
> Martin
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Sergiy Radyakin
> Sent: Monday, 16 November 2009 23:33
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: AW: Create a flag variable for 10 most frequent values
>
> Martin, could you please explain how -expand- is used here?
> Best, Sergiy
>
> On Mon, Nov 16, 2009 at 5:14 PM, Martin Weiss <martin.weiss1@gmx.de> wrote:
>>
>> <>
>>
>> Here is a strategy:
>>
>> *************
>> clear*
>>
>> //construct data
>> set obs 10000
>> gen dx=1+int(100*runiform())
>>
>> //see freqs
>> ta dx
>> //use Ben Jann's -fre-
>> capture which fre
>> if _rc ssc install fre
>> fre dx, desc
>>
>> //get counts next to "dx"s
>> bys dx: egen mycount=count(dx)
>>
>> //collapse to one per group
>> bys dx: keep if _n==1
>> //-sort- on count var
>> sort mycount
>> //take the last ten
>> gen byte mostfreq=inrange(_n,`=_N-9',_N)
>> //and back as we were
>> expand mycount
>>
>> //see result
>> ta myc mostfreq
>> *************
>>
>> HTH
>> Martin
>>
>> -----Original Message-----
>> From: owner-statalist@hsphsun2.harvard.edu
>> [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Cohen, Elan
>> Sent: Monday, 16 November 2009 22:25
>> To: 'statalist@hsphsun2.harvard.edu'
>> Subject: st: Create a flag variable for 10 most frequent values
>>
>> Hi all,
>>
>> I have a string variable dx that represents a patient's diagnosis (about
>> 5,000 unique values). I'd like to create a "top 10 flag" that equals 1 if
>> dx is one of the top 10 most frequent diagnoses and 0 otherwise.
>>
>> I'm not even sure where to begin. If someone could point me in the right
>> direction, I'd be grateful. Stata 10, Windows XP
>>
>> Thank you,
>>
>> - Elan
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
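
The concern raised in the thread (other variables such as "name" or "make" are lost once the data are reduced to one row per group) can also be sidestepped by never dropping observations. What follows is only a minimal sketch of that idea, not the approach posted above; it assumes the auto dataset with mpg standing in for the diagnosis variable, and the helper variable name tagged is made up for illustration. Ties at the tenth place are broken arbitrarily here by the value of mpg.

```stata
sysuse auto, clear

// group size stored on every observation (also works for a string key)
bysort mpg: gen long mycount = _N

// mark exactly one observation per distinct value of mpg
// ("tagged" is a hypothetical helper name)
egen tagged = tag(mpg)

// tagged observations first, most frequent values on top;
// ties in mycount are ordered by mpg (an arbitrary choice)
gsort -tagged -mycount mpg
gen byte mostfreq = tagged & (_n <= 10)

// copy the flag from the tagged observation to the rest of its group
// (within each mpg group the tagged observation sorts last)
bysort mpg (tagged): replace mostfreq = mostfreq[_N]
drop tagged

// see result: make and all other variables are still present
ta mycount mostfreq
```

Because all 74 observations stay in memory throughout, no -expand- or -merge- step is needed afterwards.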

**Follow-Ups**:
- **Re: st: AW: Create a flag variable for 10 most frequent values** *From:* Nick Winter <nwinter@virginia.edu>

**References**:
- **st: Create a flag variable for 10 most frequent values** *From:* "Cohen, Elan" <cohened@upmc.edu>
- **Re: st: AW: Create a flag variable for 10 most frequent values** *From:* Sergiy Radyakin <serjradyakin@gmail.com>
- **Re: st: AW: Create a flag variable for 10 most frequent values** *From:* Sergiy Radyakin <serjradyakin@gmail.com>

