That's clearer. Thanks.

Given the one-way ticket, you just need to monitor first occurrences.
The first occurrence is the only time that each individual's cumulative
sum of -prop- is 1. Before that it is always 0; afterwards it is 2 or more.

bysort id (time) : gen first = sum(prop) == 1

Then we just track the number of first occurrences as a function of time.

bysort first (time) : gen sumprop = sum(first)
line sumprop time if first

I am not clear that you need to fill in all the times, but that could
be done.

Nick
[email protected]

Steinar Fossedal

> Thanks Nick,
>
> however I suspect the -egen- command you mention would only generate
> the means for observations with a record at the specific time step.
> If I could make it calculate the mean using the lowest time step
> equal to or above the step we're trying to calculate, it would solve
> my problem though (the property is sticky, it's a one-way ticket). I
> know MS Excel has options for this using -vlookup/hlookup-, but the
> dataset won't fit in Excel.
> -lowess- could be usable if it smoothed over time intervals instead of
> records, but I can't see how to make it do so.
>
> The typical structure of my data is something like
>
> ID   Time   prop
>  1      1      0
>  1      2      0
>  1      4      0
>  1      5      0
>  1      6      1
>  1     60      1
>  2      1      0
>  2      2      0
>  2      3      1
>  2     48      1
>
> Notice the jumps in timespan. Smoothing within a window of records
> instead of time would produce quite different results - unless, of
> course, I could somehow add the extra records (from 10 through 59 for
> ID 1 in the example). This would solve the problems using -egen- too.
> From the example above, the result I'm looking for would be something
> like
>
> Time   Sumprop
>    1         0
>    2         0
>    3         1
>    4         1
>    5         1
>    6         2
>  ...
>   60         2

Nick Cox

> Create a variable
>
> gen is_one = prop == 1
>
> and
>
> lowess is_one time
>
> egen mean_is_one = mean(is_one), by(time)
>
> etc.
>
> Nick
> [email protected]
>
> Steinar Fossedal
>
> > I have a survival time dataset with customer information, and I want to
> > create a plot which shows the proportion of the population with a
> > certain nominal property as it changes over time. Thus I would like to
> > calculate the number of customers with the property at each time, and
> > divide it to the number of total customers (or customers with another
> > interesting property). Since there is not a record at each time t for
> > every customer, I can't simply calculate it from the records directly.
> > (- count if prop==1 & time==9 - would miss customers which got the
> > property at time 8)
> >
> > Any suggestions as to how I can do this? I played with the idea to
> > create records for all time intervals, but I can't seem to find an easy
> > way to duplicate observations either.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
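
For anyone wanting to try the first-occurrence idea from the reply at the top of this thread, here is a minimal sketch run on the toy data quoted above. It is a variation, not the exact code in the reply: it uses -collapse- to get one row per observed time (so times such as 4 and 5, where nobody switches, still appear), and then -tsfill- as one possible way to fill in the unobserved times. The variable name nfirst is made up for illustration, and the sketch assumes that -time- is integer-valued and that -prop- stays at 1 once it has switched.

```stata
* Toy data copied from the quoted example (ID, Time, prop)
clear
input id time prop
1  1 0
1  2 0
1  4 0
1  5 0
1  6 1
1 60 1
2  1 0
2  2 0
2  3 1
2 48 1
end

* Flag each id's first record with prop == 1:
* the running sum of prop equals 1 only at that record
bysort id (time) : gen first = sum(prop) == 1

* One row per observed time, holding the number of first occurrences
* (nfirst is a hypothetical name, not from the thread)
collapse (sum) nfirst = first, by(time)

* Running total of first occurrences over time
sort time
gen sumprop = sum(nfirst)

* Optional: add rows for the unobserved times and carry the total forward
* (assumes integer time; sumprop is missing on the newly added rows)
tsset time
tsfill
replace sumprop = sumprop[_n-1] if missing(sumprop)

list time sumprop in 1/8, noobs
line sumprop time, connect(stairstep)
```

On the toy data this should reproduce the table requested in the quoted message (0 at times 1 and 2, 1 from time 3, 2 from time 6 onward). Dividing sumprop by the number of distinct ids would give a proportion instead of a count.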

