sort time
gen SUMprop = sum(first)
line SUMprop time
is probably a better graph for most
purposes.
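
For anyone following along, here is a minimal sketch that runs the
commands above on the ten example records quoted further down. It
assumes -first- is created as in the quoted message below. The
-connect(stairstep)- option and the variables idtag and PROPprop are
hypothetical additions for illustration only, and dividing by the
number of distinct ids is just one guess at the denominator wanted.

```stata
* Sketch only: the ten example records from the thread
clear
input id time prop
1  1 0
1  2 0
1  4 0
1  5 0
1  6 1
1 60 1
2  1 0
2  2 0
2  3 1
2 48 1
end

* flag each id's first record with prop == 1
* (the cumulative sum of prop equals 1 only at that record)
bysort id (time) : gen first = sum(prop) == 1

* running count of first occurrences in time order
sort time
gen SUMprop = sum(first)

* assumption: proportion = count / number of distinct ids
egen byte idtag = tag(id)
quietly count if idtag
gen PROPprop = SUMprop / r(N)

list time SUMprop PROPprop
* stairstep is optional; it holds the count flat between events
line SUMprop time, connect(stairstep)
```

If several ids share a time at which one of them first gets the
property, SUMprop takes more than one value at that time; the last
value within each time is the one to read off.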
Nick
> That's clearer. Thanks.
>
> Given the one-way ticket, you just need
> to monitor first occurrences.
>
> The first occurrence is the only time
> that each individual's cumulative sum of -prop- is
> 1. Before that it is always 0; afterwards
> 2 or more.
>
> bysort id (time) : gen first = sum(prop) == 1
>
> Then we just track the number of first
> occurrences as a function of time.
>
> bysort first (time) : gen sumprop = sum(first)
> line sumprop time if first
>
> I am not clear that you need to fill in all
> the times, but that could be done.
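
If filling in the times is wanted, one possible route is -tsfill-
followed by carrying the last known value forward. This is a sketch
under stated assumptions, not the method proposed in the thread:
times are integers, no id has two records at the same time, and an
id keeps its last known status after its last record (which is what
the expected value of 2 at time 60 in the example implies).

```stata
* Sketch only: the ten example records from the thread
clear
input id time prop
1  1 0
1  2 0
1  4 0
1  5 0
1  6 1
1 60 1
2  1 0
2  2 0
2  3 1
2 48 1
end

* declare the panel structure, then add a record for every
* id at every time in the overall range (here 1 to 60)
tsset id time
tsfill, full

* the property is sticky, so carry the last known value forward;
* times before an id's first record stay missing
bysort id (time) : replace prop = prop[_n-1] if missing(prop)

* number of ids with the property at each time
* (-egen, total()- treats missing values as 0)
egen sumprop = total(prop), by(time)
```

Without the -full- option, -tsfill- only fills gaps between each id's
own first and last times, so id 2 would drop out of the count after
time 48.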
>
> Nick
>
> Steinar Fossedal
>
> > Thanks Nick,
> >
> > however I suspect the -egen- command you mention would only
> > generate the means for observations with a record at the
> > specific time step. If I could make it calculate the mean using
> > the lowest time step equal to or above the step we're trying to
> > calculate, it would solve my problem though (the property is
> > sticky, it's a one-way ticket). I know MS Excel has options for
> > this using -vlookup/hlookup-, but the dataset won't fit in Excel.
> > -lowess- could be usable if it smoothed over time intervals
> > instead of records, but I can't see how to make it do so.
> >
> > The typical structure of my data is something like
> >
> > ID Time prop
> > 1 1 0
> > 1 2 0
> > 1 4 0
> > 1 5 0
> > 1 6 1
> > 1 60 1
> > 2 1 0
> > 2 2 0
> > 2 3 1
> > 2 48 1
> >
> > Notice the jumps in timespan. Smoothing within a window of records
> > instead of time would produce quite different results - unless, of
> > course, I could somehow add the extra records (from 10 through 59
> > for ID 1 in the example). This would solve the problems using
> > -egen- too. From the example above, the result I'm looking for
> > would be something like
> >
> > Time Sumprop
> > 1 0
> > 2 0
> > 3 1
> > 4 1
> > 5 1
> > 6 2
> > ...
> > 60 2
>
> Nick Cox
>
> > Create a variable
> >
> > gen is_one = prop == 1
> >
> > and
> >
> > lowess is_one time
> >
> > egen mean_is_one = mean(is_one), by(time)
> >
> > etc.
> >
> > Nick
> >
> > Steinar Fossedal
> >
> > > I have a survival time dataset with customer information, and I
> > > want to create a plot which shows the proportion of the
> > > population with a certain nominal property as it changes over
> > > time. Thus I would like to calculate the number of customers
> > > with the property at each time, and divide it by the total
> > > number of customers (or customers with another interesting
> > > property). Since there is not a record at each time t for every
> > > customer, I can't simply calculate it from the records directly.
> > > (-count if prop==1 & time==9- would miss customers who got the
> > > property at time 8.)
> > >
> > > Any suggestions as to how I can do this? I played with the idea
> > > of creating records for all time intervals, but I can't seem to
> > > find an easy way to duplicate observations either.
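
On duplicating observations specifically, -expand- is one option.
The sketch below is an illustration under assumptions (integer
times, one record per id and time), with gap as a hypothetical
variable name: each record is copied once for every time unit until
that id's next record, and the copies are then renumbered. Because
the copies inherit prop, the last known value is carried forward
automatically.

```stata
* Sketch only: the ten example records from the thread
clear
input id time prop
1  1 0
1  2 0
1  4 0
1  5 0
1  6 1
1 60 1
2  1 0
2  2 0
2  3 1
2 48 1
end

* number of time units each record should cover:
* the distance to the id's next record, or 1 for its last record
bysort id (time) : gen gap = cond(_n < _N, time[_n+1] - time, 1)

* make gap copies of each record (gap == 1 leaves it as is)
expand gap

* within each set of copies, step time forward by one per copy
bysort id time : replace time = time + _n - 1

sort id time
```

This fills each id only up to its own last record, so id 1 ends up
with times 1 to 60 and id 2 with times 1 to 48.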