
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down at the end of May, and its replacement, statalist.org is already up and running.



Re: st: cluster analysis validation


From   Paul Millar <paulmi@nipissingu.ca>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: cluster analysis validation
Date   Mon, 23 Apr 2012 15:10:49 -0400

Lisa,

If your dataset is small, or has certain characteristics, you might
get the same result every time.  You are right: the most commonly used
method is kmeans, and I have not tried wardslinkage.
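
To make the point about starting cases concrete, here is a minimal
sketch in plain Python (not Stata, and not the -clustpop- algorithm).
The data, the function names kmeans and together_share, and the choice
of 3 groups are all assumptions made for illustration.  It runs a basic
k-means from randomly chosen starting cases, shows that fixing the seed
makes a run repeatable, and then measures how often two cases end up in
the same group across many differently seeded runs.

```python
import random


def kmeans(points, k, seed, iterations=100):
    """Basic k-means (Lloyd's algorithm) started from k randomly chosen cases."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # starting cases depend only on the seed
    labels = None
    for _ in range(iterations):
        # Assign each case to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Move each center to the mean of its current members.
        for j in range(k):
            members = [p for p, g in zip(points, labels) if g == j]
            if members:  # an empty group keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def together_share(runs, i, j):
    """Share of runs in which cases i and j were placed in the same group."""
    return sum(1 for labels in runs if labels[i] == labels[j]) / len(runs)


# Hypothetical data: 60 cases scattered uniformly, with no built-in groups.
data_rng = random.Random(0)
points = [(data_rng.random(), data_rng.random()) for _ in range(60)]

# Same seed, same starting cases, same assignments.
print(kmeans(points, 3, seed=1) == kmeans(points, 3, seed=1))  # True

# Different seeds may or may not agree; co-membership summarizes how stable
# a pairing is without relying on the arbitrary group numbers of each run.
runs = [kmeans(points, 3, seed=s) for s in range(20)]
print(together_share(runs, 0, 1))
```

Group numbers are arbitrary from run to run, which is why the sketch
compares whether two cases land together rather than comparing the
numbers themselves.  A share well above 0.5 suggests the pairing does
not depend much on the starting cases.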

- Paul

On Mon, Apr 23, 2012 at 2:45 PM, Dasinger, Lisa <ldasinger@thezenith.com> wrote:
>
> Paul -  Thank you for your response.  I have been using -cluster
> wardslinkage-, which produces the same result as long as the same
> dataset is used.  It sounds like you are talking about kmeans.  In any
> case, I will look into your program as it may also be helpful here.
>
> Lisa
>
>
> Date: Thu, 19 Apr 2012 17:41:56 -0400
> From: Paul Millar <paulmi@nipissingu.ca>
> Subject: Re: st: cluster analysis validation
>
> Lisa,
>
> Cluster analysis is empirical - the group assignments are based on
> minimizing the "distance" between cases, given the number of groups.
> So a different sample can yield different groups.  If you run a
> partition method such as kmeans many times on the same data, you can
> also get different results (because the starting cases are chosen at
> random; fixing the random-number seed with -set seed- makes a run
> repeatable).  I have written a routine that tests the reliability of
> a group assignment by sampling the group assignments and then
> estimating whether the probability of assignment to a particular
> group is > 0.5 in the population of group assignments.
> See -help clustpop- after -ssc install clustpop-
>
> You can also do this after pooling the data, as suggested earlier.
>
> - Paul
>
>
> Lisa Dasinger, Ph.D.
> Data Reporting Manager
> Claims Analytics
>
> Zenith Insurance Company
> Pleasanton Regional Office
> 4309 Hacienda Drive, Suite 200
> Pleasanton, CA 94588
>
> Phone: 925.416.5235
> RightFax: 925.460.1235
> Branch: 925.460.0600
> ldasinger@thezenith.com
>
> www.TheZenith.com
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



-- 
- Paul Millar, Ph.D.
School of Criminology and Criminal Justice
Nipissing University
North Bay, Ontario, Canada
www.paulmillar.ca
