Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


From: David Kantor <kantor.d@att.net>
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: converting high frequency data to low frequency
Date: Fri, 05 Nov 2010 10:25:37 -0400

Thank you to Nick for the correction and for bringing me up-to-date.
--David

At 07:59 AM 11/5/2010, you wrote:

David's suggestion strikes me as right in principle, but I think he's still thinking in terms of the bad old days before Stata 10 when people had to work out their own awkward ways of handling times of day. That's a misunderstanding here.

As always, the _format_ of these data is a matter of how they are to be displayed, and not a matter of how they are stored. (An article on the most common misunderstandings of Stata would surely include this one.)

Dimitriy's data look exactly like standard Stata date-times, allowed in Stata 10 up, meaning that underneath the cosmetic format they are times in milliseconds (ms). Therefore, he wants to round in units of 1000 * 60 * 5 = 300000.

Here is a concrete example which covers everything needed to understand this problem.

Using a %tc format for a -clock()- conversion of 11:31:00 today gives us back, not surprisingly, the same information:

. di %tc clock("5 Nov 2010 11:31:00", "DMYhms")
05nov2010 11:31:00

But underneath all that, the precise date-time _really_ is just an integer with units ms.

. di %20.0f clock("5 Nov 2010 11:31:00", "DMYhms")
       1604575860000

(The "20" in the format is much more than I need but causes no problem here.)

You can round down or round up; which way you go is a matter of taste or convention. I almost never round using -int()-. I almost always round using -floor()- or -ceil()- because then I know immediately that I am rounding down (-floor()-) or up (-ceil()-; think ceiling) and I don't get bit around 0 because the way -int()- works with negative numbers is not what I usually want, except that I might forget that or not foresee it might happen with my data.

Now rounding down, for example, in units of 5 minutes is rounding down in units of 300000 ms. There are three steps, except that they can be combined in one line:

1. Divide by 300000.
2. Round down to the next integer below.
3. Multiply by 300000.

So, the result is another large integer,

. di %20.0f 300000 * floor(clock("5 Nov 2010 11:31:00", "DMYhms")/300000)
       1604575800000

But we should check that we did it right:

. di %tc 300000 * floor(clock("5 Nov 2010 11:31:00", "DMYhms")/300000)
05nov2010 11:30:00

With a variable it's going to be

gen double binnedtime = 300000 * floor(ordertime/300000)
format binnedtime %tc

Never forget the -double-. Then you can -collapse- (or better -contract-) in terms of the new variable. (If it's really just time of day you care about, you must get there first by subtraction.)

(I suggested generalising -floor()- and -ceil()- some years ago to StataCorp so that with two arguments -floor(ordertime, 300000)-, say, would do what is above, but the suggestion is still lurking in their files. A good argument against would be that the long-winded way to do it, as above, is easy enough.)

See also if desired

SJ-3-4  dm0002  . . . . . . Stata tip 2: Building with floors and ceilings
        . . . . . . . . . . . . . . . . . . . . . . . . . . . N. J. Cox
        Q4/03   SJ3(4):446--447                          (no commands)
        tips for using floor() and ceil()

Nick
n.j.cox@durham.ac.uk

[...]
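
For anyone who wants to try the recipe above, here is a minimal do-file sketch that runs the same steps on a toy dataset. The four observations, their 90-second spacing, and the variable names timeofday and binnedtod are made up for illustration; ordertime and binnedtime are the names used above. The time-of-day lines are one possible reading of "get there first by subtraction" (subtracting midnight of the same day), not necessarily what was intended.

```stata
* Toy data (an assumption): four date-times 90 seconds apart, from 11:31:00
clear
set obs 4
gen double ordertime = clock("5 Nov 2010 11:31:00", "DMYhms") + (_n - 1) * 90000
format ordertime %tc

* Round down to 5-minute bins: divide, floor, multiply
gen double binnedtime = 300000 * floor(ordertime/300000)
format binnedtime %tc

* Time of day only: subtract midnight of the same day, then bin as before
gen double timeofday = ordertime - cofd(dofc(ordertime))
gen double binnedtod = 300000 * floor(timeofday/300000)
format timeofday binnedtod %tcHH:MM:SS

list, noobs

* Why -floor()- rather than -int()-: they differ for negative numbers
di int(-1.5)      // truncates toward zero: -1
di floor(-1.5)    // rounds down: -2

* Why -double-: a float cannot hold ms date-times exactly
di %20.0f clock("5 Nov 2010 11:31:00", "DMYhms")
di %20.0f float(clock("5 Nov 2010 11:31:00", "DMYhms"))

* One observation per bin, with frequencies in _freq
contract binnedtime
list, noobs
```

With these toy values the first three observations should land in the 11:30:00 bin and the last in the 11:35:00 bin. The two -di- lines near the end should print different numbers: at this magnitude adjacent float values are 131072 ms (a little over two minutes) apart, which is why storing a binned date-time as a float can silently move it.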

