
RE: st: On the analysis of intraday data


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   RE: st: On the analysis of intraday data
Date   Tue, 2 Mar 2004 10:35:52 -0000

William Gould replied to Levy Lee

> > [...] I don't know if Stata has built-in functions to handle time
> > variables, as it does for date variables. I have a variable that
> > looks like the following:
> >
> >
> >    Obs. No.                            Time
> >       1                          2003-03-03 09:30:29
> >       2                          2003-03-03 09:30:47
> >       3                          2003-03-03 09:31:11
> >        
> > [...] I wish there were a built-in function so that we could
> > calculate that 2003-03-03 09:30:47 minus 2003-03-03 09:30:29
> > equals 18 seconds.
> 
> There is no built-in function, but obtaining the desired result
> should not be too difficult.  Although this is "intraday" data,
> let's write our code so that, were times interday, we would still
> get the right answer.
> 
>             ----+----1----+----2----+----3----+----4
>             2003-03-03 09:30:29
> 
>         . gen str date = substr(time, 1, 10) 
>         . assert substr(time,11,1)==" "
>         . gen hour = real(substr(time,12,2))
>         . assert substr(time,14,1)==":"
>         . gen min  = real(substr(time,15,2))
>         . assert substr(time,17,1)==":"
>         . gen sec  = real(substr(time,18,2))
> 
> Now that I've got the components, and have established that all
> observations have the expected format, we can make the calculation:
> 
>         . gen edate = date(date, "ymd") 
>         . gen double secs = date*24*60*60 + hour*60*60 + min*60 + sec
> 
> Variable -secs- now contains the number of seconds since
> 01jan1960 00:00:00.
> Note that I have stored -secs- as a double.  That is important.
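
To see why the storage type matters, here is a minimal sketch (not part 
of Bill's code) that stores the first example timestamp both ways. The 
variable names -secs_d- and -secs_f- are made up for illustration. 
Values near 1.36 billion lie between 2^30 and 2^31, where floats are 
spaced 128 apart, so a float cannot hold the exact second. 

```stata
* Sketch: the same seconds-since-1960 value stored as double and as float
clear
set obs 1

* 2003-03-03 09:30:29 expressed as seconds since 01jan1960 00:00:00
gen double secs_d = mdy(3, 3, 2003)*24*60*60 + 9*60*60 + 30*60 + 29

* Copying into a float rounds to the nearest representable value
gen float secs_f = secs_d

format secs_d secs_f %15.0f
list secs_d secs_f

* The float copy has lost the exact second; it is a multiple of 128
assert secs_d != secs_f
assert mod(secs_f, 128) == 0
```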

It may be of interest to spell out the similarity between 
this solution and one using -split-. -substr()-, -real()- 
and -date()- are all important functions that almost everyone 
needs eventually in their Stata work. -split-, by contrast, 
is a convenience command based on -substr()-, so we are 
talking about the same solution, approached either from 
zeroth principles or through a convenience command. 

In the case of -time-, with values like "2003-03-03 09:30:29"
we want to -split- on the space and the colons, and we want
to get numeric values where possible, so 

. split time, p(" " :) destring 

This should yield 

time1 date, still a string (cannot -destring-) 
time2 hours (should be able to -destring-) 
time3 minutes (same) 
time4 seconds (same) 

Look at -split-'s output for signs of problems, 
or use -assert- to test. 
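
As a rough sketch of what such tests might look like, the following 
rebuilds the three example observations and checks the pieces that 
-split- produces. The range checks are my own suggestion of what to 
test, not something -split- requires. 

```stata
* Sketch: checking the output of -split- on the example data
clear
input str19 time
"2003-03-03 09:30:29"
"2003-03-03 09:30:47"
"2003-03-03 09:31:11"
end

split time, p(" " :) destring

* time1 should remain a string; time2-time4 should now be numeric
confirm string variable time1
confirm numeric variable time2 time3 time4

* inrange() is false for missing values, so these also catch failed conversions
assert inrange(time2, 0, 23)
assert inrange(time3, 0, 59)
assert inrange(time4, 0, 59)
```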

Then the last steps are very similar to Bill's: 

. gen edate = date(time1, "ymd") 
. gen double secs = edate*24*60*60 + time2*60*60 + time3*60 + time4

The last command corrects a typo in Bill's last line ("date" 
should be "edate"). 

Nick Cox 
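
For completeness, here is a sketch of the whole -split- route applied 
to the three example observations, ending with the difference in 
seconds between consecutive observations that the original question 
asked for. The variable name -elapsed- is made up for illustration. 
The mask is written as "YMD" on the assumption that a recent Stata is 
being used; the lower-case "ymd" shown above reflects the syntax 
current when this thread was written. 

```stata
* Sketch: from timestamp strings to elapsed seconds between observations
clear
input str19 time
"2003-03-03 09:30:29"
"2003-03-03 09:30:47"
"2003-03-03 09:31:11"
end

* Split on the space and the colons; numeric pieces become numeric variables
split time, p(" " :) destring

* Days since 01jan1960, then seconds since 01jan1960 00:00:00 (as a double)
gen edate = date(time1, "YMD")
gen double secs = edate*24*60*60 + time2*60*60 + time3*60 + time4

* Seconds elapsed since the previous observation (missing for the first)
gen double elapsed = secs - secs[_n-1]

format secs %15.0f
list time secs elapsed

* 09:30:47 - 09:30:29 is 18 seconds; 09:31:11 - 09:30:47 is 24 seconds
assert elapsed[2] == 18
assert elapsed[3] == 24
```

Because the date enters the calculation, the differences remain 
correct if consecutive observations fall on different days. 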
