
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: st: RE: Help with data management
Date: Sun, 13 Jul 2008 17:15:03 +0100

Katia Bobulova asked three questions. My answers below.

Nick
n.j.cox@durham.ac.uk

I have some problems in managing my dataset. I have a dataset with high frequency data, with observations about time, date, price, quantity and a code for the type of security.

Question 1
==========

I have the time in this format: 63000 and I want to construct a "timedate" variable, which contains both the date and the time. First of all I tried to modify the format of the time, typing:

    g hours=int(time/10000)
    g minutes=int((time-hours*10000)/100)
    g seconds=int(time-hours*10000-minutes*100)
    g newtime=hms(hours, minutes, seconds)
    format newtime%tc

However, after typing hms() I receive this message:

    Unknown function hms()
    r(133);

The next step would be typing something like:

    gen double timedate=date*24*60*60*1000+time
    format timedate %tcNN/DD/CCYY_HH:MM:SS

Answer 1
========

-hms()- is an -egen- function written by Kit Baum and is included in the -egenmore- package from SSC. It will _only_ work with -egen-, not with -generate-.

But you don't need it. You did almost all the work yourself. Assuming that e.g. 63000 is 06:30:00, then given your variables -hours-, -minutes-, -seconds-,

    gen long time_in_sec = 3600 * hours + 60 * minutes + seconds

Question 2
==========

I have two types of prices: price1 and price2. I would like to create a variable which takes the difference between the lowest value of price1 and the highest value of price2 for each date and time. First of all I sorted my dataset:

    sort date time price1 price2

then I generated the new variable:

    gen price3=(price1[_n]-price2[_N] & date==(26feb2008) & time==63000

But as a result I have all missing values. Furthermore, I have to do this for each date and time, so I was wondering if there is a way to instruct Stata to create this new variable for each time and date.

Answer 2
========

I don't understand how you reached your solution, nor why it produces missing values. I don't think your -sort- order guarantees what you want, but it looks quite wrong anyway. This is likely to be closer to where you want to be.

    bysort date time (price1) : gen diff = price1[1]
    bysort date time (price2) : replace diff = price2[_N] - diff

Question 3
==========

I have to divide my variable time in equal time intervals. For example, I have observations at 14:41:38 and 15:28:32 and I would like to have observations at precise time intervals, for example every 5 minutes, i.e. at 14:40:00, 14:45:00 and so on. Any idea on how to do this?

Answer 3
========

Create another dataset with regularly spaced observations and then -merge-.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
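
The suggestion in Answer 3 (build a regularly spaced dataset, then -merge-) might be sketched roughly as follows. This is an illustration under stated assumptions, not code from the original reply: it assumes Stata 11 or later (for the -merge m:1- syntax and the -cofd()- function), a daily date variable -date-, and the -time_in_sec- variable from Answer 1. The file name trades.dta is hypothetical. The last lines also show one possible way to arrive at the "timedate" variable asked about in Question 1.

```stata
* Sketch only; trades.dta is a hypothetical file name.
* Assumes -date- is a Stata daily date and -time_in_sec- holds
* seconds since midnight, as generated in Answer 1.

* 1. One observation per distinct date in the data
use trades, clear
keep date
duplicates drop
tempfile dates
save `dates'

* 2. A grid of 5-minute marks for every date
clear
set obs 288                              // 24 hours x 12 marks per hour
gen long time_in_sec = 300 * (_n - 1)    // 0, 300, ..., 86100
cross using `dates'                      // every mark paired with every date
tempfile grid
save `grid'

* 3. Merge the grid into the original data
use trades, clear
merge m:1 date time_in_sec using `grid'
* _merge == 2 flags grid times with no observation at that exact second

* 4. A single date-time variable in milliseconds since 01jan1960
gen double timedate = cofd(date) + 1000 * time_in_sec
format timedate %tc
sort timedate
```

The merge only adds the grid times as extra observations with missing prices and quantities. How to fill those in (for example, by carrying the last observed price forward within each security) is a separate decision that depends on the analysis.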

**References**: **st: Help with data management** *From:* Katia Bobulova <katia.bobulova@googlemail.com>

