Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: st: Merging/summing daily values across time periods in (unbalanced) panel dataset


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: Merging/summing daily values across time periods in (unbalanced) panel dataset
Date   Wed, 9 Mar 2011 13:11:24 +0000

Assuming that your -date_from- and -date_to- are really daily date variables, you can start with 

gen duration = date_to - date_from + 1 
expand duration 

on your master and then use the logic in 

FAQ     . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        2/03    How can I replace missing values with previous or
                following nonmissing values?
                http://www.stata.com/support/faqs/data/missing.html

to replace with sequences of dates. 

Nick 
[email protected] 

Amanda Dwelley

Thanks, good point! The same daily x1 x2 apply to everyone in the
master dataset - to start, a "merge" ID could be 1 for everyone in the
master data set and all of the using data set.

Master data set looks like:
id date_from date_to  merge_id
2 07apr2009 09jun2009  1
2 10jun2009 03jul2009  1
3 16may2009 12jun2009  1
3 13jun2009 10jul2009  1
(~50,000 observations; time periods at irregular intervals)

Daily data could look like:
merge_id  date x1 x2
1  02jun2009 5.7 0
1  03jun2009 2.3 0
1  04jun2009 0 2
1  05jun2009 0 3.5
(3 yrs of daily values that are same on each day for each person in
master data set)

For example, for first observation in master data, looking for sum of
x1 from April 7-June 9, and sum of x2 within same dates.

On Tue, Mar 8, 2011 at 8:47 PM, Nick Cox <[email protected]> wrote:
> I see no information on identifiers in your daily dataset.
>
> Nick
>
> On Tue, Mar 8, 2011 at 10:15 PM, Amanda Dwelley <[email protected]> wrote:
>
>> I'm looking for suggestions on how to join and sum daily x values from one
>> dataset into time periods defined by a start date and end date in another
>> (panel) data set.
>>
>> Master data set looks like:
>> id date_from date_to
>> 1 07apr2009 09jun2009
>> 1 10jun2009 03jul2009
>> 2 13jun2009 10jul2009
>> 2 11jul2009 05aug2009
>> (~50,000 observations; time periods at irregular intervals)
>>
>> Daily data looks like:
>> date x1 x2
>> 02jun2009 5.7 0
>> 03jun2009 2.3 0
>> 04jun2009 0 2
>> 05jun2009 0 3.5
>> (3 yrs of daily values)
>>
>> As you can see the using data could be in wide or long format.
>> If I merge the wide format daily data into the master dataset, is there any
>> way to reference/look for the dates in variable names that are between the
>> start/end date, and sum associated variables? If using the long format of
>> the daily data, is there a way to sum values between start/end dates during
>> a join or merge?

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index