That's clearer. Thanks.
Given the one-way ticket, you just need
to monitor first occurrences.
The first occurrence is the only time
that each individual's cumulative sum of -prop- is
1. Before that it is always 0; afterwards
2 or more.
bysort id (time) : gen first = sum(prop) == 1
Then we just track the number of first
occurrences as a function of time.
bysort first (time) : gen sumprop = sum(first)
line sumprop time if first
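
As a check, here is a minimal sketch that runs the two lines above on
the example data quoted below. The -input- block simply retypes that
example with lower-case variable names; nothing else is assumed.

```stata
clear
input id time prop
1  1 0
1  2 0
1  4 0
1  5 0
1  6 1
1 60 1
2  1 0
2  2 0
2  3 1
2 48 1
end

* flag, for each id, the one record where the running sum of prop hits 1
bysort id (time) : gen first = sum(prop) == 1

* within the flagged records, in time order, count first occurrences so far
bysort first (time) : gen sumprop = sum(first)

* show only the records where the count steps up
sort time
list id time sumprop if first, noobs
```

With those ten records, -first- is 1 only for id 2 at time 3 and for
id 1 at time 6, so -sumprop- steps to 1 at time 3 and to 2 at time 6,
which agrees with the table asked for.
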
I am not clear that you need to fill in all
the times, but that could be done.
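
If every time point is wanted, one possible route is sketched here.
It assumes integer times, one record per id and time, and that -prop-
never reverts to 0. The idea is to -tsset- the data, add the missing
times with -tsfill, full-, and carry -prop- forward. The names n_with,
n_ids, share and tag are made up for illustration.

```stata
* assumes id, time, prop as in the example; one record per id-time pair
tsset id time
tsfill, full

* carry the last known value of prop forward within each id
bysort id (time) : replace prop = prop[_n-1] if missing(prop)

* customers with the property, and all customers, at each time
egen n_with = total(prop == 1), by(time)
egen n_ids  = count(id), by(time)
gen share   = n_with / n_ids

* keep one point per time for the plot
egen tag = tag(time)
line share time if tag, sort
```

Note that -tsfill, full- also adds records before an id's first
observed time. Those keep a missing -prop-, so they count in the
denominator but not in the numerator; whether that is wanted depends
on when a customer should enter the population.
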
Nick
Steinar Fossedal
> Thanks Nick,
>
> however I suspect the -egen- command you mention would only calculate
> the means for observations with a record at the specific time
> step. If I could make it calculate the mean using the lowest time step
> equal to or above the step we're trying to calculate, it would solve my
> problem though (the property is sticky, it's a one-way ticket). I know
> MS Excel has options for this using -vlookup/hlookup-, but the dataset
> won't fit in Excel.
> -lowess- could be usable if it smoothed over time intervals instead of
> records, but I can't see how to make it do so.
>
> The typical structure of my data is something like
>
> ID Time prop
> 1 1 0
> 1 2 0
> 1 4 0
> 1 5 0
> 1 6 1
> 1 60 1
> 2 1 0
> 2 2 0
> 2 3 1
> 2 48 1
>
> Notice the jumps in timespan. Smoothing within a window of records
> instead of time would produce quite different results - unless, of
> course, I could somehow add the extra records (from 10 through 59 for ID
> 1 in the example). This would solve the problems using -egen- too. From
> the example above, the result I'm looking for would be something like
>
> Time Sumprop
> 1 0
> 2 0
> 3 1
> 4 1
> 5 1
> 6 2
> ...
> 60 2
Nick Cox
> Create a variable
>
> gen is_one = prop == 1
>
> and
>
> lowess is_one time
>
> egen mean_is_one = mean(is_one), by(time)
>
> etc.
>
> Nick
>
> Steinar Fossedal
>
> > I have a survival time dataset with customer information, and I want to
> > create a plot which shows the proportion of the population with a
> > certain nominal property as it changes over time. Thus I would like to
> > calculate the number of customers with the property at each time, and
> > divide it by the total number of customers (or customers with another
> > interesting property). Since there is not a record at each time t for
> > every customer, I can't simply calculate it from the records directly.
> > (-count if prop==1 & time==9- would miss customers who got the
> > property at time 8)
> >
> > Any suggestions as to how I can do this? I played with the idea to
> > create records for all time intervals, but I can't seem to find an easy
> > way to duplicate observations either.