
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: expanding data set by variable


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: expanding data set by variable
Date   Wed, 9 May 2012 00:04:54 +0100

I'd turn this round and ask

1. What the date variables are (string, numeric, numeric with a date format)?

2. Why you think the second data structure is going to be a good one?
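The first question can usually be settled with a quick look at storage types and display formats. A minimal sketch, assuming the variables are named start and end as in the sample posted:

```stata
* show storage type (e.g. str8, long, double) and display format (e.g. %td)
describe start end

* look at a few raw values as they are displayed
list start end in 1/5
```

A string type, or a numeric type with a general format such as %10.0g, would mean the values are not yet Stata daily dates and need converting before any date arithmetic.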

If this were my data, I would get a different structure this way:

clear
input ID double (start end) user str1 type
 1         20071001      20071010        1       A
 2         20071003      20071231        1       A
 3         20071009      20080214        1       A
 4         20080117      20080117        1       B
 5         20070306      20070308        2       A
 6         20070314      20070319        2       A
 7         20070314      20070316        2       A
end

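* convert the numeric yyyymmdd values to Stata daily dates
gen mydate = date(string(start, "%12.0f"), "YMD")
gen mydate2 = date(string(end, "%12.0f"), "YMD")
format mydate %td
* number of days each record covers, counting both endpoints
gen length = mydate2 - mydate + 1
* one copy of each observation per day covered
expand length
* step the date forward by one day per copy within each ID
bysort ID : replace mydate = mydate + _n - 1
drop mydate2 length
edit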
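
As a hedged follow-up sketch, not part of the code above: once the data are one row per ID per day, the result can be checked, and overlaps counted, along these lines. The variable name noverlap is just an illustration.

```stata
* each ID should have consecutive days with no gaps or repeats
bysort ID (mydate) : assert mydate == mydate[1] + _n - 1

* ID and mydate together should identify observations uniquely
isid ID mydate

* hypothetical: number of records for the same user covering each day
bysort user mydate : gen noverlap = _N
```

With one row per day, a day-by-day overlap count falls out of a single by-group statement, which is one reason to prefer this layout.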

Nick

On Tue, May 8, 2012 at 9:49 PM, KOTa <[email protected]> wrote:

> I need help splitting each observation, which has start and end
> variables, into several observations according to the values of start
> and end, so that there is no partial time overlap between them (full
> overlap is OK), with all other variables kept the same.
>
> sample of data:
>
> ID                start         end                  user       type
> 1         20071001      20071010        1       A
> 2         20071003      20071231        1       A
> 3         20071009      20080214        1       A
> 4         20080117      20080117        1       B
> 5         20070306      20070308        2       A
> 6         20070314      20070319        2       A
> 7         20070314      20070316        2       A
>
> The result I need is (from the first 4):
>
> ID                start         end                  user       type
> 1         20071001      20071003        1       A
> 1         20071003      20071009        1       A
> 1         20071009      20071010        1       A
> 2         20071003      20071009        1       A
> 2         20071009      20071010        1       A
> 2         20071010      20071231        1       A
> 3         20071009      20071010        1       A
> 3         20071010      20071231        1       A
> 3         20071231      20080214        1       A
> 4         20080117      20080117        1       B
>
> I was thinking of counting how many overlaps there are for each
> observation (i.e. 3 for ID 1), saving that count in an additional
> variable, and then using expand and replacing the start/end values.
>
> But I didn't find a way to copy an observation a number of times
> given by the value of a variable.
