The Stata listserver

st: RE: RE: RE: RE: Calculating the changing proportion of a population with a certain property over time


From   "Nick Cox" <[email protected]>
To   <[email protected]>
Subject   st: RE: RE: RE: RE: Calculating the changing proportion of a population with a certain property over time
Date   Mon, 12 Dec 2005 23:26:43 -0000

sort time 
gen SUMprop = sum(first) 
line SUMprop time 

is probably a better graph for most 
purposes. 
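
As a check, here is a minimal sketch that runs the commands above on the
example data quoted further down. The -input- block, the helper variable
-firstrec-, the local macro -nid-, the variable -share- and the
-connect(stairstep)- option are additions for illustration and are not
from the original posts. Dividing by the number of distinct ids assumes
that the population is fixed, which may not hold in a survival-time dataset.

```stata
* example data from the thread (tab-separated table quoted below)
clear
input id time prop
1 1 0
1 2 0
1 4 0
1 5 0
1 6 1
1 60 1
2 1 0
2 2 0
2 3 1
2 48 1
end

* flag the first record on which each id has the property
bysort id (time) : gen first = sum(prop) == 1

* assumption: a fixed population, so count the distinct ids once
bysort id (time) : gen byte firstrec = _n == 1
count if firstrec
local nid = r(N)

* running count of ids that have acquired the property so far
sort time
gen SUMprop = sum(first)

* the same count as a share of all ids
gen share = SUMprop / `nid'

* stairstep keeps the line flat between records instead of interpolating
line SUMprop time, connect(stairstep)
```

On these ten records SUMprop is 0 at times 1 and 2, 1 from time 3, and 2
from time 6 onwards, which matches the result asked for below.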

Nick 
[email protected] 

> That's clearer. Thanks. 
> 
> Given the one-way ticket, you just need 
> to monitor first occurrences. 
> 
> The first occurrence is the only time 
> that each individual's cumulative sum of -prop- is 
> 1. Before that it is always 0; afterwards 
> 2 or more. 
> 
> bysort id (time) : gen first = sum(prop) == 1 
> 
> Then we just track the number of first 
> occurrences as a function of time. 
> 
> bysort first (time) : gen sumprop = sum(first) 
> line sumprop time if first 
> 
> I am not clear that you need to fill in all
> the times, but that could be done. 
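
If filling in every time is wanted, one possible sketch uses -tsset- and
-tsfill-, run here on the example data from this thread. It assumes integer
times with no repeated times within an id. It also assumes that a customer
keeps the last recorded status after their final record, which would be
wrong if customers can leave the population. The names -mean_prop- and
-onepertime- are made up for illustration.

```stata
* example data from the thread
clear
input id time prop
1 1 0
1 2 0
1 4 0
1 5 0
1 6 1
1 60 1
2 1 0
2 2 0
2 3 1
2 48 1
end

* declare the panel structure so that -tsfill- knows id and time
tsset id time

* add a record for every id at every time from the overall minimum to maximum
tsfill, full

* carry the last known status forward into the new records
* (assumption: status persists after an id's last real record;
*  times before an id's first record stay missing)
bysort id (time) : replace prop = prop[_n-1] if missing(prop)

* proportion with the property at each time, among ids with known status
egen mean_prop = mean(prop), by(time)

* plot one point per time
egen byte onepertime = tag(time)
line mean_prop time if onepertime, sort connect(stairstep)
```

The design choice here is to trade memory for simplicity: after filling in,
a plain mean by time is the proportion, at the cost of one record per id
per time.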
> 
> Nick 
> [email protected] 
> 
> Steinar Fossedal
>  
> > Thanks Nick, 
> > 
> > however I suspect the -egen- command you mention would only generate
> > the means for observations with a record at the specific time step.
> > If I could make it calculate the mean using the lowest time step
> > equal to or above the step we're trying to calculate, it would solve
> > my problem though (the property is sticky, it's a one-way ticket).
> > I know MS Excel has options for this using -vlookup/hlookup-, but
> > the dataset won't fit in Excel.
> > -lowess- could be usable if it smoothed over time intervals instead
> > of records, but I can't see how to make it do so.
> > 
> > The typical structure of my data is something like
> > 
> > ID	Time	prop
> > 1	1	0
> > 1	2	0
> > 1	4	0
> > 1	5	0
> > 1	6	1
> > 1	60	1
> > 2	1	0
> > 2	2	0
> > 2	3	1
> > 2	48	1
> > 
> > Notice the jumps in timespan. Smoothing within a window of records
> > instead of time would produce quite different results - unless, of
> > course, I could somehow add the extra records (from 10 through 59
> > for ID 1 in the example). This would solve the problems using -egen-
> > too. From the example above, the result I'm looking for would be
> > something like
> > 
> > Time	Sumprop
> > 1	0
> > 2	0
> > 3	1
> > 4	1
> > 5	1
> > 6	2
> > ...
> > 60	2
> 
> Nick Cox
> 
> > Create a variable 
> > 
> > gen is_one = prop == 1 
> > 
> > and
> > 
> > lowess is_one time 
> > 
> > egen mean_is_one = mean(is_one), by(time) 
> > 
> > etc. 
> > 
> > Nick 
> > [email protected] 
> > 
> > Steinar Fossedal
> >  
> > > I have a survival time dataset with customer information, and I
> > > want to create a plot which shows the proportion of the population
> > > with a certain nominal property as it changes over time. Thus I
> > > would like to calculate the number of customers with the property
> > > at each time, and divide it by the number of total customers (or
> > > customers with another interesting property). Since there is not a
> > > record at each time t for every customer, I can't simply calculate
> > > it from the records directly. (-count if prop==1 & time==9- would
> > > miss customers who got the property at time 8.)
> > > 
> > > Any suggestions as to how I can do this? I played with the idea of
> > > creating records for all time intervals, but I can't seem to find
> > > an easy way to duplicate observations either.
