Statalist



st: RE: Re: Create a flag variable for 10 most frequent values


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: Re: Create a flag variable for 10 most frequent values
Date   Wed, 18 Nov 2009 15:33:55 -0000

Indeed, a mode in the strict sense is the location of the peak of the
probability distribution, with differences in terminology in discrete and continuous
cases (and with an understanding that such a peak may occur twice or
more in various symmetrical examples). But many people are interested in
modes in the wide sense, which are local peaks. I see no reason to limit
a program by applying some censorship according to what really isn't a
mode in some authorities' view. 

In practice the definition is pretty fuzzy. I'm sure that many of us
have been brought up short by a student describing something as bimodal,
when a bundle of experiences with histograms makes more experienced
people very wary of the consequences of sampling variation and binning
artifacts. We often mutter something like "Well, in practice, it's got
to be _strongly_ bimodal before you _call_ it bimodal" and then wonder
why students think statistics to be confusing. 

-modes- will happily report that all the values that occur once in a
variable with no ties are modes. It's for the user to decide whether
that was a silly question, or just the wrong question. 
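
The same behaviour is easy to reproduce in a small Python sketch, again purely as an illustration rather than the -modes- code. The function name and the min_freq argument are hypothetical; min_freq is one way a user might rule out the "every value occurs once" answer.

```python
from collections import Counter


def modes_with_min_freq(values, min_freq=1):
    """Most frequent value(s), or [] if the top frequency is below min_freq.

    min_freq is a hypothetical parameter of this sketch.
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top < min_freq:
        return []
    return sorted(v for v, c in counts.items() if c == top)


all_distinct = [3, 1, 4, 2]
print(modes_with_min_freq(all_distinct))              # [1, 2, 3, 4]
print(modes_with_min_freq(all_distinct, min_freq=2))  # []
```

With no ties every value shares the top frequency of 1, so all of them are reported unless the user asks for more.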

Nick 
n.j.cox@durham.ac.uk 

Kit Baum wrote:

I guess some of the difficulty here is semantic. When I first read the
post it did not occur to me that a user wanting to know what the most
frequent values are, in terms of most frequent, second m.f., third m.f.,
... tenth m.f. should think about modes. Surely the mode is THE m.f.
value, and we often speak of a distribution being bimodal or even
multi-modal. But the use of the term 'modes' to represent, say, the 100
most frequently occurring given names among those born in 2008 is not
obvious to me. 

Nevertheless, it is a very good thing that Nick's -modes- might answer
that question!
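
For the original question, flagging the k most frequent values, a short Python sketch might look like the following. It is an assumption-laden illustration, not the thread's Stata solution: flag_top_k is a hypothetical name, and ties at the k-th position are broken by whichever value appears first in the data, which may or may not be what a user wants.

```python
from collections import Counter


def flag_top_k(values, k=10):
    """Return a 0/1 flag per observation: 1 if its value is among the
    k most frequent values, else 0.

    Assumption: ties at the k-th place are resolved by first appearance,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(values)
    top = {v for v, _ in counts.most_common(k)}
    return [1 if v in top else 0 for v in values]


names = ["Ann", "Bob", "Ann", "Cy", "Bob", "Ann", "Dee"]
print(flag_top_k(names, k=2))  # [1, 1, 1, 0, 1, 1, 0]
```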

On Nov 18, 2009, at 2:33 AM, statalist-digest wrote:

> There is no guarantee that questioners use the most appropriate word,
> but "most frequent" might suggest "mode" to many statistically-minded
> people.
