Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: Ulrich Kohler <kohler@wzb.eu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: sequence analysis
Date: Thu, 05 Jan 2012 14:06:50 +0100

On Thursday, 05.01.2012, at 20:09 +1000, Melanie Spallek wrote:

> Hi, I'm trying to do a sequence analysis using housing trajectories
> (tenure status). I'm using -sqset tenure id wave, trim- to define my
> sequences, followed by -sqtab, so-. I receive a frequency table of the
> sequences; the so option treats identically all sequences that have
> the same order of elements, i.e., the sequence A-B-B-A would be
> treated the same as A-B-A-A, which is exactly what I want. My first
> (and most important) question is if and how I can save those
> sequences in a variable. Since my dataset is in long format, I'm
> aware that, with ten waves for each id, the 'sequence variable' will
> occur ten times, which is ok.

Brendan Halpin has answered the question. Apart from that, have you checked the sq-egen function -sqfreq()-? It generates a variable holding the frequencies reported by -sqtab-. This could be helpful in case the end goal of this exercise is to get those frequencies as a variable.

If you want to create a "SO"-sequence dataset, this could be done as follows. Starting from a sequence dataset in long format such as

    . use http://www2000.wzb.eu/~kohler/ado/youthemp, clear
    . reshape long st, i(id) j(order)
    . list, sepby(id)

you could type

    . by id (order), sort: gen first = st!=st[_n-1]
    . keep if first

so that you arrive at:

    . list, sepby(id)

You can then create a new order variable and sqset your data again:

    . by id (order): gen soorder = _n
    . sqset

> My second question is the following. I have used the keeplongest
> option to define the sequences, and every time I run exactly the
> same code followed by sqtab, I get slightly different results.
> I'm thinking that this might be due to Stata randomly selecting
> different consecutive sequences, e.g., A-A-A-.-.-.-B-B-B, so if I run
> it once, keeplongest might select the A-A-A sequence, and the next
> time it might select the B-B-B sequence, as both are the same length,
> and hence I get slightly different frequencies for the A-A-A and
> B-B-B sequences. Has anybody got any other explanation?

No. That explanation is quite correct. I have sent a fix to Kit Baum that corrects this. With the correction, the last block among all blocks of equal length is always retained. If I find time, I will let the user specify whether it is the first, the last, or a randomly selected block.

Uli

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
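The "SO" collapse described in the reply can be tried on a small self-contained example. The sketch below is only an illustration: the ids and state codes are made up (they are not the youthemp data), with state 1 standing in for A and state 2 for B, and the final -sqset- call is left as a comment because it assumes the SSC package sq is installed and that -sqset- takes the element, id, and order variables in that order, as in the -sqset tenure id wave- call quoted in the question.

```stata
* Hypothetical toy data in long format: id 1 is A-B-B-A, id 2 is A-B-A-A
clear
input id order st
1 1 1
1 2 2
1 3 2
1 4 1
2 1 1
2 2 2
2 3 1
2 4 1
end

* Flag the first observation of each run of identical states within id.
* For the first observation of an id, st[_n-1] is missing, so a
* nonmissing st always counts as the start of a run.
bysort id (order): gen byte first = st != st[_n-1]
keep if first

* Renumber the remaining elements so the new order variable has no gaps
by id (order): gen soorder = _n

list id soorder st, sepby(id)

* Assumption: with sq installed (ssc install sq), the collapsed data
* could then be declared as sequence data again:
* sqset st id soorder
```

In this toy example both ids end up with the same three elements, 1-2-1, which is the sense in which A-B-B-A and A-B-A-A are treated as the same order of elements.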