Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# RE: st: converting high frequency data to low frequency

From: David Kantor
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: converting high frequency data to low frequency
Date: Fri, 05 Nov 2010 10:25:37 -0400

Thank you to Nick for the correction and for bringing me up-to-date.
--David

At 07:59 AM 11/5/2010, you wrote:

David's suggestion strikes me as right in principle, but I think he's still thinking in terms of the bad old days before Stata 10 when people had to work out their own awkward ways of handling times of day. That's a misunderstanding here.

As always, the _format_ of these data is a matter of how they are to be displayed, and not a matter of how they are stored. (An article on the most common misunderstandings of Stata would surely include this one.)

Dimitry's data look exactly like standard Stata date-times, allowed in Stata 10 up, meaning that underneath the cosmetic format they are times in milliseconds (ms). Therefore, he wants to round in units of 1000 * 60 * 5 = 300000.

Here is a concrete example which covers everything needed to understand this problem.

Using a %tc format for a -clock()- conversion of 11:31:00 today gives us back, not surprisingly, the same information:
```
. di %tc  clock("5 Nov 2010 11:31:00", "DMYhms")
05nov2010 11:31:00

```
But underneath all that, the precise date-time _really_ is just an integer with units ms.
```
. di %20.0f  clock("5 Nov 2010 11:31:00", "DMYhms")
1604575860000

(The "20" in the format is much more than I need but causes no problem here.)

```
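A quick check of the units, as a minimal sketch. Five minutes in ms can be worked out by hand or, assuming a Stata version that has the date-time helper functions (Stata 10 or later, as far as I know), with -msofminutes()-. The extractor functions -hh()-, -mm()- and -ss()- pull the clock components back out of the same integer. The scalar name tstamp is made up for illustration.

```stata
* five minutes expressed in ms, by hand and via the helper function
display 1000 * 60 * 5
display msofminutes(5)

* hold the date-time in a scalar (scalars are stored as doubles)
scalar tstamp = clock("5 Nov 2010 11:31:00", "DMYhms")

* hour, minute and second recovered from the underlying integer
display hh(tstamp)
display mm(tstamp)
display ss(tstamp)
```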
You can round down or round up; which way you go is a matter of taste or convention. I almost never round using -int()-. I almost always round using -floor()- or -ceil()-, because then I know immediately that I am rounding down (-floor()-) or up (-ceil()-; think ceiling). I also don't get bitten around 0: -int()- truncates towards zero, so it rounds negative numbers up, which is not what I usually want, and I might forget that or not foresee that it could happen with my data.
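A small sketch of the difference. It only shows up for negative arguments, which in this context means date-times before the start of 1960, where Stata's ms count goes negative. The 1959 time below is an invented example.

```stata
* positive argument: int() and floor() agree (both give 2), ceil() gives 3
display int(2.5)
display floor(2.5)
display ceil(2.5)

* negative argument: int() gives -2 (towards zero), floor() gives -3,
* ceil() gives -2
display int(-2.5)
display floor(-2.5)
display ceil(-2.5)

* two minutes before the start of 1960 is -120000 ms;
* floor() bins it down to 31dec1959 23:55:00,
* while int() pushes it up to 01jan1960 00:00:00
display %tc 300000 * floor(clock("31 Dec 1959 23:58:00", "DMYhms")/300000)
display %tc 300000 * int(clock("31 Dec 1959 23:58:00", "DMYhms")/300000)
```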
Now rounding down, for example, in units of 5 minutes is rounding down in units of 300000 ms. There are three steps, except that they can be combined in one line:
1. Divide by 300000.
2. Round down to the next integer below.
3. Multiply by 300000.

So, the result is another large integer,

```
. di %20.0f  300000 * floor(clock("5 Nov 2010 11:31:00", "DMYhms")/300000)
1604575800000
```

But we should check that we did it right:

```
. di %tc  300000 * floor(clock("5 Nov 2010 11:31:00", "DMYhms")/300000)
05nov2010 11:30:00
```

With a variable it's going to be

```
gen double binnedtime = 300000 * floor(ordertime/300000)
format binnedtime %tc
```
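To see the variable version end to end, here is a minimal sketch on toy data. The four order times and the string variable stamp are invented for illustration; ordertime and binnedtime follow the names used above. The last step uses -contract-, which leaves one observation per bin with the count in _freq; if each order carried a quantity to be summed, -collapse- would be the tool instead.

```stata
* toy data: four hypothetical order times on one morning
clear
input str20 stamp
"5 Nov 2010 11:31:00"
"5 Nov 2010 11:33:30"
"5 Nov 2010 11:36:10"
"5 Nov 2010 11:39:59"
end

* convert the strings to date-times; -double- keeps the ms exact
gen double ordertime = clock(stamp, "DMYhms")
format ordertime %tc

* bin down to the start of each 5-minute interval
gen double binnedtime = 300000 * floor(ordertime/300000)
format binnedtime %tc

list ordertime binnedtime

* one observation per bin, with the number of orders in _freq
contract binnedtime
list
```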
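Never forget the -double-: date-times in ms are far too large for the default float type to hold exactly. Then you can -collapse- (or better -contract-) in terms of the new variable. (If it's really just time of day you care about, you must get there first by subtracting the start of each day.)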
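One way to do that subtraction, as a sketch rather than the only route: -dofc()- gives the date of a date-time and -cofd()- turns that date back into the date-time at midnight, so the difference is ms since midnight. The variable names timeofday and binnedtod are made up for illustration, and the single observation is toy data.

```stata
* toy data: one hypothetical order time
clear
set obs 1
gen double ordertime = clock("5 Nov 2010 11:31:00", "DMYhms")
format ordertime %tc

* ms elapsed since midnight on the same day
gen double timeofday = ordertime - cofd(dofc(ordertime))

* bin the time of day into 5-minute intervals, ignoring the date
gen double binnedtod = 300000 * floor(timeofday/300000)

* a %tc format with only clock elements displays these as times of day
format timeofday binnedtod %tcHH:MM:SS
list
```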
(I suggested generalising -floor()- and -ceil()- to StataCorp some years ago, so that with two arguments -floor(ordertime, 300000)-, say, would do what is above, but the suggestion is still lurking in their files. A good argument against would be that the long-winded way to do it, as above, is easy enough.)
```
SJ-3-4  dm0002  . . . . . . . . Stata tip 2: Building with floors and ceilings
        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        Q4/03   SJ 3(4):446--447                                 (no commands)
        tips for using floor() and ceil()
```

Nick
n.j.cox@durham.ac.uk
[...]