Dear Statalisters,

I am currently trying to analyse a data set on firm survival. I have read in various sources how to transform the data into the appropriate survival analysis format. Unfortunately, I don't know anybody familiar with survival analysis, so I don't know whether what I have done so far is correct. If experienced survival data analysts could have a glance at my approach and comment, that would be great.

Here is a sketch of what my dataset looks like:

    id   year   X     failure   establishment
    1    1981   X11   1         1977
    2    2000   X21   0         1999
    2    2001   X22   0         1999
    2    2002   X23   0         1999
    3    1981   X31   1         1980
    4    1980   X41   0         1979
    4    1981   X42   0         1979
    4    1989   X43   0         1979
    4    1990   X44   1         1979
    5    1992   X45   0         1987
    5    1995   X51   1         1987
    6    1983   X61   0         1982
    6    1984   X62   0         1982
    6    1985   X63   1         1982

So there is left truncation, right censoring, and possibly gaps within an id.

Continuous time analysis:

The commands I used to -snapspan- and -stset- the data set are:

    g begin=year-1
    snapspan id year failure, g(begin_span) replace
    stset year, id(id) time0(begin) origin(time establishment) f(failure)

Am I making any (obvious) mistakes here? In particular, I am not sure whether my 'time0()' definition is ok. I tried using the variable generated by -snapspan- (i.e. begin_span) instead, but in that case Stata does not recognise the gaps.

Discrete time analysis:

My main question here is whether I can include the firms with gaps in a cloglog analysis or not (assuming I have brought the data into an appropriate format for a cloglog model).

Thanks for any tips or comments,
Mat

