Dear Stata Users,

I am very new to survival analysis, and I have a couple of questions regarding piecewise-constant exponential models (I am using the stpiece command).

My data set is basically the history of an industry since its beginning. An observation is at the firm-year level. I am trying to estimate the impact of an event on the hazard of exit for the firms that suffer the event, as opposed to the ones that do not. For each firm, I have the entry and exit (death) year, and I can observe the year when the event happens.

The two issues that I am facing are the following:

a. I have a large number of firms that exist for only one year. When I use the "stset" command to declare the dataset for survival analysis, those observations are dropped (i.e. _st==0). This is an issue because these are more than 1000 firms in a sample of 2806. Note that the event of interest does *not* happen for any of the firms that are dropped from the analysis. So basically, these are the complete failures of the industry.

b. The second problem I have is that some firms suffer two events in the same year. This is an important issue because the two events differ in their characteristics; thus, it is not clear which one to keep and which one to drop for a given year when there are two of them.

c. My last question concerns the errors. I usually work with panel data techniques, where errors are clustered at the firm level. Would this be appropriate clustering for the survival analysis as well? I could alternatively cluster on founding state or founding city. I guess my question is whether the clustering reasoning for panel data transfers to survival analysis when we are talking about firms.

As a solution to problems a. and b. above, I considered transforming the data to semi-annual (rather than annual) observations. For problem (a) above, this would mean assuming that firms that failed within a year survived for one semester of that year.
For problem (b), I can make the simplifying assumption that one event occurs in the first 6-month period of the year and the second event occurs in the second 6-month period, and thus allow for two events within one year. Thus:

d. Is the procedure I am proposing (i.e. transforming annual data to semi-annual data) inherently wrong? Is there any econometric reason why this should not be done? Do the assumptions that I am making seem too strong?

e. Is there any other way to fix issues (a) and (b) above?

Thank you in advance for your help.

yiannis

