
Re: st: clustering on string similarity

From	Dan Weitzenfeld <[email protected]>
To	[email protected]
Subject	Re: st: clustering on string similarity
Date	Fri, 1 May 2009 17:59:25 -0700

For posterity - someone pointed me to -seqcomp-, which was exactly what
I was looking for.

On Fri, May 1, 2009 at 12:17 PM, Dan Weitzenfeld
<[email protected]> wrote:
> Hi Folks,
> In working with eye-tracking data, a person's sequence of
> areas-of-interest viewed (a "scan pattern") is often represented as
> a string.  E.g., my scan pattern in the first 5 seconds of looking at a
> webpage might be PPHM, with P = picture, H = headline, and M = side
> menu.
> For this reason, I am interested in clustering on string similarity,
> to identify commonly-taken scan patterns in a dataset.
> It looks like my best bet is to create a dissimilarity matrix (using
> Levenshtein distance as the dissimilarity measure) and then use
> -clustermat-.
>
> My questions are:
>   -are there any packages out there that would make this easier?
>   -am I right that I will have to write a program to make the matrix?
>  I'm fine with writing it, I just want to confirm that I'm not missing
> an easier way to do this.
>
> Thanks in advance,
> Dan
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
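
For anyone who lands here without access to -seqcomp-, the approach
described in the quoted message (a pairwise Levenshtein dissimilarity
matrix that is then handed to a clustering routine) can be sketched
roughly as follows. This is only an illustration in Python, not Stata
code and not a description of how -seqcomp- works internally; the
function names levenshtein and dissimilarity_matrix and the sample
scan patterns are made up for the example.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit row length.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def dissimilarity_matrix(patterns: Sequence[str]) -> List[List[int]]:
    """Symmetric matrix of pairwise Levenshtein distances."""
    n = len(patterns)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = levenshtein(patterns[i], patterns[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Hypothetical scan patterns: P = picture, H = headline, M = side menu.
    scan_patterns = ["PPHM", "PHM", "PPHH", "MMHP"]
    for pattern, row in zip(scan_patterns, dissimilarity_matrix(scan_patterns)):
        print(pattern, row)
```

The resulting matrix is the kind of input that a matrix-based
hierarchical clustering command such as -clustermat- expects; getting
it into Stata (or computing it there directly) is a separate step not
shown here.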



-- 
Dan Weitzenfeld
Media Analyst
EmSense Corporation
512 2nd Street, 3rd Floor
San Francisco, CA 94107
w: 415.418.7314
m: 510.552.0106
[email protected]


