



st: SQ-Ados sequence analysis: problems in clustering


From   "BUSSI, Margherita" <mbussi@etui.org>
To   "'statalist@hsphsun2.harvard.edu'" <statalist@hsphsun2.harvard.edu>
Subject   st: SQ-Ados sequence analysis: problems in clustering
Date   Thu, 11 Apr 2013 15:48:47 +0000

Dear Stata-users,

I have run into a problem while running an optimal matching analysis on a dataset of 3,700 observations (24 periods each, i.e. 88,800 person-periods in long format). There are no gaps in the sequences.
My sequence variable is labour market position, which takes 8 values:

label define lm 1 "employed" ///
            2 "self-employed" ///
            3 "unempl" ///
            4 "stage attente" ///
            5 "unempl after studies" ///
            6 "education" ///
            7 "social assis" ///
            8 "other" 
label values lm lm

I sqset the data and then ran the following:

sqset lm pid order
sqom, subcost(minprobdistance)
sqclustermat wardslinkage SQdist, name(wards) add

An error message then appears:

/dissimilarity matrix:  too many specified

I know that Christian Brzinsky-Fay et al. (2006) say: "There is one trap in applying the clustermat command to the dissimilarity matrix created by sqom, which stems from the fact that the sequence data and the dissimilarity matrix have different dimensions. The dissimilarity matrix cannot be attached to the sequence data on a row-by-row basis, which also applies to the results from the cluster analysis of the dissimilarity matrix. The SQ-Ados therefore contain a command that helps to link the results of the user-specified clustermat command to the original sequence data. Its syntax is"

. sqclusterdat
. clustermat wardslinkage SQdist, name(wards) add
. clustermat singlelinkage SQdist, name(single) add
. cluster tree wards, cutnumber(20)
. sqclusterdat, return

But even when I apply this syntax, Stata reports that the matrix does not fit the sequence data.
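
For reference, the order of steps that the quoted passage seems to describe can be sketched as follows. This is only a sketch under assumptions, not a confirmed fix: it assumes that sqom needs its full option to store all pairwise distances in the matrix SQdist, and that sqclusterdat reshapes the data in memory so that its rows match that matrix. The five-group cut and the variable name wards5 are purely illustrative, and cost options are left out.

```stata
* Sketch only: assumes the SQ-Ados are installed and the data are in long format
sqset lm pid order

* Assumption: the full option stores all pairwise distances in the matrix SQdist
sqom, full

* Assumption: this reshapes the data in memory so that its rows match SQdist
sqclusterdat

* Ward's linkage on the dissimilarity matrix, results added to the data in memory
clustermat wardslinkage SQdist, name(wards) add

* Hypothetical choice of 5 groups; wards5 is an illustrative variable name
cluster generate wards5 = groups(5), name(wards)

* Bring the cluster results back to the original sequence data
sqclusterdat, return
```

The point of this ordering is that clustermat is only called while the data in memory have the same dimensions as SQdist, which is the trap the quoted passage warns about.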

I'm sure I have made a mistake somewhere. Has anyone had this problem before, or does anyone have a clue how to solve it?

Thanks a lot in advance!

Regards,

Margherita 
