# Re: st: are there any statistics rules that I can apply to separate numbers into groups?

From: Ada Ma
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: are there any statistics rules that I can apply to separate numbers into groups?
Date: Wed, 11 Mar 2009 10:26:44 +0000

```
Thank you to both Partha Deb and Kyle Hood for providing me with some
very promising-looking leads to try.

Regards,

On Wed, Mar 11, 2009 at 7:11 AM, Kyle K. Hood <kyle.hood@yale.edu> wrote:
> In mapping, univariate classification schemes are used to group features
> together.  An example is Jenks' natural breaks, which simply defines k-1
> cutoffs to minimize within-group sums of squared deviations from group means.
> Unfortunately,
>
> . findit jenks
>
> produces nothing.  However, there is information on the web regarding how to
> compute these cutoffs (just google it).  I'm not sure how closely this
> method relates to cluster analysis and finite mixture models.
>
> Kyle
>
> Partha Deb wrote:
>>
>> Although one can never be sure what's in someone else's mind, I suspect
>> you are looking for cluster analysis. -help cluster- .  Finite mixture
>> models may also be of interest. -findit fmm- .  See
>> http://users.ox.ac.uk/~polf0050/ISS%20Lecture%208.pdf for a set of slides by
>> Stephen Fisher that has an introduction to cluster analysis and finite
>> mixture models.
>>
>> Best.
>>
>> Partha
>>
>>
>>>
>>> Hi Statalisters,
>>>
>>> I am looking for a solution to a problem but I have no idea where to
>>> start.
>>>
>>> Let's say I have 50 packets of crisps of various weights and I would
>>> like to separate these 50 packets of crisps into five groups based on
>>> their weights in grams, as follows:
>>>
>>> 108.9702
>>> 111.1337
>>> 112.5217
>>> 112.6697
>>> 112.9962
>>> 114.0323
>>> 114.6699
>>> 116.8646
>>> 119.0719
>>> 124.5645
>>> 124.691
>>> 126.4943
>>> 126.5528
>>> 133.5675
>>> 134.9519
>>> 140.7979
>>> 144.228
>>> 102.8566
>>> 103.9373
>>> 104.7436
>>> 107.5397
>>> 109.4443
>>> 109.7089
>>> 110.395
>>> 112.1248
>>> 113.6032
>>> 115.6405
>>> 117.1919
>>> 120.0395
>>> 121.0714
>>> 121.7119
>>> 110.1116
>>> 112.0128
>>> 117.6563
>>> 118.2418
>>> 126.0027
>>> 127.8855
>>> 92.21352
>>> 92.45715
>>> 92.953
>>> 93.01508
>>> 94.05335
>>> 94.27259
>>> 94.38242
>>> 94.72507
>>> 94.83315
>>> 95.25914
>>> 95.37813
>>> 95.52933
>>>
>>> I don't want to separate them into five equally sized groups.  I want
>>> to separate the packets into groups so that the group members are most
>>> similar to one another.  I am looking for a method (or methods?) to
>>> achieve this end but I don't know where to start.  If you can think of
>>> any suggestion please fire away and I'd be most grateful!
>>>
>>> Regards,
>>>
>>>
>>>
>>
>
> --
> Kyle Hood
> Department of Economics
> Yale University
> New Haven, CT
> website: http://www.econ.yale.edu/~kkh25
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

--
Research Fellow
Health Economics Research Unit
University of Aberdeen, UK.
http://www.abdn.ac.uk/heru/
Tel: +44 (0) 1224 553863
Fax: +44 (0) 1224 550926

```
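The criterion Kyle describes (choose k-1 cutoffs so that the within-group sum of squared deviations from the group means is as small as possible) can be sketched for a single variable as follows. This is an illustration in Python rather than Stata, and it uses an exact dynamic-programming search over contiguous groups of the sorted values, which may differ from the procedure in any particular implementation of Jenks' natural breaks. The names `natural_breaks` and `weights` are made up for this example, and the data are the 49 values as they appear in the original post.

```python
from itertools import accumulate

# Weights in grams, copied from the original post (49 values as listed).
weights = [
    108.9702, 111.1337, 112.5217, 112.6697, 112.9962, 114.0323, 114.6699,
    116.8646, 119.0719, 124.5645, 124.691, 126.4943, 126.5528, 133.5675,
    134.9519, 140.7979, 144.228, 102.8566, 103.9373, 104.7436, 107.5397,
    109.4443, 109.7089, 110.395, 112.1248, 113.6032, 115.6405, 117.1919,
    120.0395, 121.0714, 121.7119, 110.1116, 112.0128, 117.6563, 118.2418,
    126.0027, 127.8855, 92.21352, 92.45715, 92.953, 93.01508, 94.05335,
    94.27259, 94.38242, 94.72507, 94.83315, 95.25914, 95.37813, 95.52933,
]


def natural_breaks(values, k):
    """Split values into k contiguous groups (after sorting) that minimize
    the total within-group sum of squared deviations (SSD) from group means.

    Hypothetical helper for illustration; returns (groups, total_ssd).
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Prefix sums of x and x^2 give the SSD of any slice xs[i:j] in O(1).
    s1 = [0.0] + list(accumulate(xs))
    s2 = [0.0] + list(accumulate(x * x for x in xs))

    def ssd(i, j):
        total = s1[j] - s1[i]
        # Clamp at zero to absorb tiny negative rounding errors.
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # cost[g][j]: smallest total SSD when xs[:j] is split into g groups.
    # cut[g][j]: start index of the last group in that best split.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                c = cost[g - 1][i] + ssd(i, j)
                if c < cost[g][j]:
                    cost[g][j] = c
                    cut[g][j] = i

    # Walk the stored cut points backwards to recover the groups.
    groups = []
    j = n
    for g in range(k, 0, -1):
        i = cut[g][j]
        groups.append(xs[i:j])
        j = i
    groups.reverse()
    return groups, cost[k][n]


groups, total_ssd = natural_breaks(weights, 5)
for number, group in enumerate(groups, start=1):
    print(f"group {number}: n={len(group)}, "
          f"range {group[0]} to {group[-1]}")
print("total within-group SSD:", total_ssd)
```

Because the search is exhaustive over contiguous splits, the result does not depend on random starting values, unlike an iterative k-means run. The cost grows roughly with k times the square of the number of observations, so this sketch suits small samples like the one in the post.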