Re: st: expanding data set by variable

From	Nick Cox <[email protected]>
To	[email protected]
Subject	Re: st: expanding data set by variable
Date	Wed, 9 May 2012 00:04:54 +0100

I'd turn this round and ask

1. What the date variables are (string, numeric, numeric with a date format)?

2. Why you think the second data structure is going to be a good one?

If this were my data, I would get a different structure this way:

clear
input ID double(start end) user str1 type
 1         20071001      20071010        1       A
 2         20071003      20071231        1       A
 3         20071009      20080214        1       A
 4         20080117      20080117        1       B
 5         20070306      20070308        2       A
 6         20070314      20070319        2       A
 7         20070314      20070316        2       A
end

* convert the yyyymmdd numbers to Stata daily dates
gen mydate = date(string(start, "%12.0f"), "YMD")
gen mydate2 = date(string(end, "%12.0f"), "YMD")
format mydate %td
* one observation per day from start to end, inclusive
gen length = mydate2 - mydate + 1
expand length
bysort ID : replace mydate = mydate + _n - 1
drop mydate2 length
edit
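
The two ideas used above can be checked in isolation. What follows is only a minimal sketch on toy data: the variables id, n and copy are made up for illustration and are not part of the original data set. It shows that a yyyymmdd number converts to the expected daily date, and that -expand- leaves each observation present n times in total.

```stata
* Toy check of the date conversion: 20071001 should be 1 October 2007
assert date(string(20071001, "%12.0f"), "YMD") == mdy(10, 1, 2007)

* Toy check of -expand-: n holds the total number of copies wanted
clear
input id n
1 3
2 1
3 2
end

expand n
* number the copies within each id: 1, 2, ...
bysort id : gen copy = _n
* 3 + 1 + 2 observations in total
assert _N == 6
list, sepby(id)
```

Numbering the copies with _n under -bysort- is the same device that shifts mydate forward one day per copy in the code above.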

Nick

On Tue, May 8, 2012 at 9:49 PM, KOTa <[email protected]> wrote:

> I need help creating, from each observation with start and end
> variables, several observations according to the values of start and
> end, i.e. splitting observations so that there is no partial time
> overlap between them (full overlap is OK), keeping all other
> variables the same.
>
> sample of data:
>
> ID                start         end                  user       type
> 1         20071001      20071010        1       A
> 2         20071003      20071231        1       A
> 3         20071009      20080214        1       A
> 4         20080117      20080117        1       B
> 5         20070306      20070308        2       A
> 6         20070314      20070319        2       A
> 7         20070314      20070316        2       A
>
> the result i need is (from first 4)
>
> ID                start         end                  user       type
> 1         20071001      20071003        1       A
> 1         20071003      20071009        1       A
> 1         20071009      20071010        1       A
> 2         20071003      20071009        1       A
> 2         20071009      20071010        1       A
> 2         20071010      20071231        1       A
> 3         20071009      20071010        1       A
> 3         20071010      20071231        1       A
> 3         20071231      20080214        1       A
> 4         20080117      20080117        1       B
>
> I was thinking about counting how many overlaps there are for each
> observation (e.g. 3 for ID 1), saving that count in an additional
> variable, and then using expand and replacing the start/end values.
>
> But I didn't find a way to copy an observation a number of times given
> by the value of a variable.
