Bookmark and Share

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: converting high frequency data to low frequency

From   David Kantor <>
Subject   Re: st: converting high frequency data to low frequency
Date   Thu, 04 Nov 2010 20:45:14 -0400

At 06:58 PM 11/4/2010, Dimitry wrote:
I have data on orders that looks like this:

store_id        ordertime
1901    10oct2010 15:48:18
1901    10oct2010 15:50:36
1901    10oct2010 16:41:35
1901    10oct2010 16:53:37
1901    10oct2010 18:38:24
1901    10oct2010 19:09:22
1901    10oct2010 21:14:22
1906    10oct2010 17:54:59
1906    12oct2010 19:21:27
1906    17oct2010 16:10:01
1906    17oct2010 19:34:39
1906    17oct2010 21:11:00
1906    17oct2010 22:18:55
1906    17oct2010 22:55:13
1906    19oct2010 22:23:10
1906    20oct2010 01:19:24

Ordertime is in %tc format. I would like to count the number of orders
in 5 minute buckets by store_id. I would like to use

collapse (count) ordertime, by(store_id bucket)

but I have no idea how to group the ordertime into 5 minute intervals
that are nicely labeled. I imagine I could calculate the elapsed time
since midnight for each day, divide that into 5 minute intervals (or
6000 unit intervals), and find some way to label, but I have not
gotten this to work yet. Is there's an easier way? I tried googling
and findit-ing, but I am not sure I know what this type of
transformation is called.

I believe you need to do just what you stated. In particular, if h, m , and s represent the hours, minutes and seconds, then you want
 gen long s2 = 3600 * h + 60 * m + s

then take int(s2/300)

(You may want to s2 to include date*24*3600 -- if you don't use the date as a separate component of the identifier.)

There may be other methods available that I'm not aware of.

*   For searches and help try:

© Copyright 1996–2018 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   Site index