Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

From |
Nick Cox <njcoxstata@gmail.com> |

To |
statalist@hsphsun2.harvard.edu |

Subject |
Re: st: Create Timeline based on Dates |

Date |
Mon, 29 Oct 2012 18:30:53 +0000 |

Thanks for the closure! On Mon, Oct 29, 2012 at 5:58 PM, Lisa Wang <lhwang0925@gmail.com> wrote: > Dear Nick, > > I did a combination of the two suggestions (dow and then also the > forward/backward time differences) and it worked brilliantly. Exactly > as I wanted!! > > Thank you so much for your time and effort. You are my rescuer! > > Best regards, > Lisa > > On Fri, Oct 26, 2012 at 8:18 PM, Nick Cox <njcoxstata@gmail.com> wrote: >> On weekends: >> >> It seems that you have Mondays to Fridays only in your data, and you >> want time to run in sequence so that Monday follows Friday. >> >> The -dow()- function returns 1 to 5 for Mondays to Fridays. A bit of >> messing around yields a mapping to sequential "dates" that omit >> Saturdays and Sundays. (If you want another origin, you can just shift >> it.) >> >> Again, I need a sandpit. >> >> clear >> set obs 14 >> gen date = mdy(10, 20, 2012) + _n >> format date %td >> gen dow = dow(date) >> >> . l >> >> +-----------------+ >> | date dow | >> |-----------------| >> 1. | 21oct2012 0 | >> 2. | 22oct2012 1 | >> 3. | 23oct2012 2 | >> 4. | 24oct2012 3 | >> 5. | 25oct2012 4 | >> |-----------------| >> 6. | 26oct2012 5 | >> 7. | 27oct2012 6 | >> 8. | 28oct2012 0 | >> 9. | 29oct2012 1 | >> 10. | 30oct2012 2 | >> |-----------------| >> 11. | 31oct2012 3 | >> 12. | 01nov2012 4 | >> 13. | 02nov2012 5 | >> 14. | 03nov2012 6 | >> +-----------------+ >> >> gen seqdate = (5 * (date - dow(date) - 2) / 7) + dow(date) >> replace seqdate = . if inlist(dow(date), 0, 6) >> >> . l >> >> +---------------------------+ >> | date dow seqdate | >> |---------------------------| >> 1. | 21oct2012 0 . | >> 2. | 22oct2012 1 13776 | >> 3. | 23oct2012 2 13777 | >> 4. | 24oct2012 3 13778 | >> 5. | 25oct2012 4 13779 | >> |---------------------------| >> 6. | 26oct2012 5 13780 | >> 7. | 27oct2012 6 . | >> 8. | 28oct2012 0 . | >> 9. | 29oct2012 1 13781 | >> 10. | 30oct2012 2 13782 | >> |---------------------------| >> 11. | 31oct2012 3 13783 | >> 12. | 01nov2012 4 13784 | >> 13. | 02nov2012 5 13785 | >> 14. | 03nov2012 6 . | >> >> If you are now going to tell me that the schools have >> {holidays|vacations} too, then you really need a business calendar. >> >> >> On Fri, Oct 26, 2012 at 9:36 AM, Nick Cox <njcoxstata@gmail.com> wrote: >>> For problems like this I need a sandpit to play in. Here is one I made: >>> >>> list, sepby(id) >>> >>> +-------------------+ >>> | id date event | >>> |-------------------| >>> 1. | 1 13 0 | >>> 2. | 1 13 0 | >>> 3. | 1 14 1 | >>> 4. | 1 15 0 | >>> 5. | 1 17 0 | >>> 6. | 1 18 1 | >>> 7. | 1 19 0 | >>> |-------------------| >>> 8. | 2 14 0 | >>> 9. | 2 15 0 | >>> 10. | 2 15 1 | >>> 11. | 2 17 0 | >>> 12. | 2 18 0 | >>> 13. | 2 19 1 | >>> 14. | 2 20 0 | >>> +-------------------+ >>> >>> I see this as follows. >>> >>> 1. There is a date looking forward, which is (present date - previous >>> event date), and is thus zero or positive >>> >>> 1'. There is a twist on 1: There can be multiple observations with the >>> same date. >>> >>> 2. There is a date looking backward which is (present date - next >>> event date), and is thus zero or negative >>> >>> 2'. As 1'. >>> >>> 3. The wanted date is the smaller in absolute value. If there is a tie >>> in absolute value, I choose the positive value. >>> >>> 4. For dates before the first event, no previous date can be >>> identified. But this is not a problem, as the backward date will be >>> the solution for these dates. >>> >>> 4. For dates after the last event, no next date can be identified. But >>> this is not a problem, as the forward date will be the solution for >>> these dates. >>> >>> To get "forward dates", we just copy previous values as needed, after >>> spreading each event to all dates that are the same: >>> >>> gen prev = date if event == 1 >>> bysort id date (prev) : replace prev = prev[1] if prev[1] == 1 >>> bysort id (date) : replace prev = prev[_n-1] if mi(prev) >>> gen forward = date - prev >>> >>> To get "backward dates", we can use the trick of reversing time. >>> >>> gen negdate = -date >>> gen next = date if event == 1 >>> bysort id negdate (next) : replace next = next[1] if next[1] == 1 >>> bysort id (negdate) : replace next = next[_n-1] if mi(next) >>> gen backward = date - next >>> >>> Now can we do the comparison: >>> . >>> gen timeline = cond(abs(forward) <= abs(backward), forward, backward) >>> sort id date >>> >>> list id date forw backw timeline, sepby(id) >>> >>> +-------------------------------------------+ >>> | id date forward backward timeline | >>> |-------------------------------------------| >>> 1. | 1 13 . -1 -1 | >>> 2. | 1 13 . -1 -1 | >>> 3. | 1 14 0 0 0 | >>> 4. | 1 15 1 -3 1 | >>> 5. | 1 17 3 -1 -1 | >>> 6. | 1 18 0 0 0 | >>> 7. | 1 19 1 . 1 | >>> |-------------------------------------------| >>> 8. | 2 14 . -1 -1 | >>> 9. | 2 15 0 0 0 | >>> 10. | 2 15 0 0 0 | >>> 11. | 2 17 2 -2 2 | >>> 12. | 2 18 3 -1 -1 | >>> 13. | 2 19 0 0 0 | >>> 14. | 2 20 1 . 1 | >>> +-------------------------------------------+ >>> >>> For the complication with weekends, Stata offers business calendars as >>> a complete solution. I have never used them. >>> >>> On Fri, Oct 26, 2012 at 1:02 AM, Lisa Wang <lhwang0925@gmail.com> wrote: >>> >>>> I would like to create a timeline based on some event date (ie. ...-5, >>>> -4, -3, -2, -1, 0, +1, +2, +3...etc). I have different students names >>>> in a variable named "as" (column 1) and also a set of dates (column >>>> 2) as well as another variable 'edate' (column 3) which has the event >>>> dates and . everywhere else if it didn't match with column 2. What I >>>> would like to know is how to create the timeline with the event date >>>> being 0 for each student. >>>> >>>> This is the code I have run so far: >>>> >>>> - bysort as: generate rank =_n >>>> >>>> . bysort as: generate erank = rank if date==edate >>>> >>>> . bysort as: egen erank_pop = min(erank) >>>> >>>> . bysort as: generate t = rank -erank_pop - >>>> >>>> There are three problems which have me now stuck. >>>> >>>> 1. I might have multiple observations for a particular student on the >>>> same date as well. Therefore, when I run the first line of code, it's >>>> already erroneous as Stata will treat it as being different dates. I >>>> tried also -bysort as(date): generate rank =_n - instead but it >>>> returns an error: "factor variables and time-series operators not >>>> allowed". >>>> >>>> 2. Sometimes I have multiple event dates for a particular student - I >>>> would like Stata to guess which event date the date is closer to and >>>> then do the time differences from that. >>>> >>>> 3. The dates in column 2 have all weekdays but no weekends (as the >>>> students don't need to go to school on those days), so if I do a >>>> timeline then it will skip some dates (eg. -5 then to -2,-1 etc. as a >>>> result of the weekend). How would I overcome this, so that it actually >>>> is -3,-2,-1 etc? * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/faqs/resources/statalist-faq/ * http://www.ats.ucla.edu/stat/stata/

**References**:**st: Create Timeline based on Dates***From:*Lisa Wang <lhwang0925@gmail.com>

**Re: st: Create Timeline based on Dates***From:*Nick Cox <njcoxstata@gmail.com>

**Re: st: Create Timeline based on Dates***From:*Nick Cox <njcoxstata@gmail.com>

**Re: st: Create Timeline based on Dates***From:*Lisa Wang <lhwang0925@gmail.com>

- Prev by Date:
**Re: st: Average If with Multiple Conditions** - Next by Date:
**Re: st: Lag operators on panel data inside -bysort-** - Previous by thread:
**Re: st: Create Timeline based on Dates** - Next by thread:
**st: Subtract Closest Cell Which has A Value** - Index(es):