Re: st: suppressing low frequency observations in tabulation


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: suppressing low frequency observations in tabulation
Date   Thu, 25 Oct 2012 00:45:00 +0100

Although I retain a certain moderate affection for this program, there
are other solutions, including

. contract drugname
. sort _freq
. keep in -25/L
. tab drugname [fw=_freq] , sort
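
Note that -contract- replaces the data in memory with one observation
per distinct value of drugname. A minimal sketch of a variant that
restores the original data afterwards, assuming that there are at
least 25 distinct values of drugname (otherwise -keep in 1/25- fails)
and that arbitrary handling of ties at the 25th place is acceptable:

. preserve
. contract drugname
. gsort -_freq
. keep in 1/25
. list drugname _freq, noobs
. restore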

On Wed, Oct 24, 2012 at 11:49 PM, Nick Cox <[email protected]> wrote:
> This problem is addressed by the user-written program -modes-
> originally published in STB-50 in 1999:
>
> STB-50  sg113 . . . . . . . . . . . . . . . . . . . . . .  Tabulation of modes
>         (help modes if installed) . . . . . . . . . . . . . . . . .  N. J. Cox
>         7/99    pp.26--27; STB Reprints Vol 9, pp.180--181
>         provides table of most frequent observations (modes)
>
> The software was updated in Stata Journal 3(2) (2003) and 9(4) (2009)
> so that the most recent version can be installed after typing
>
> . net describe sg113_2, from(http://www.stata-journal.com/software/sj9-4)
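>
> As a sketch, assuming that the package name sg113_2 shown above is
> still current at that location, installation from the command line
> would be
>
> . net install sg113_2, from(http://www.stata-journal.com/software/sj9-4)
> . help modes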
>
> Nick
>
> On Wed, Oct 24, 2012 at 11:08 PM, Kevin McConeghy
> <[email protected]> wrote:
>
>> I have a large dataset, roughly 6.5 million observations, which is the FDA adverse
>> event database. Variable drugname is the string describing the drug.
>>
>> . describe drugname
>>
>>               storage  display     value
>> variable name   type   format      label      variable label
>> ---------------------------------------------------------------------------------------------------------------------------------------------------
>> drugname        str30  %30s
>>
>> I want to create a frequency table of the top 25 drug "offenders" in
>> the database, however I am having trouble figuring out how to get
>> Stata to perform the tab drugname command without including all the
>> low frequency observations from random drugs (which causes Stata to
>> stop the command because of "too many values"). I can't see an option for
>> this in the syntax. Any advice on how to filter out all the background
>> noise for this?