st: RE: Help with data management

Subject   st: RE: Help with data management
Date   Mon, 14 Jul 2008 07:16:53 -0700


Katia Bobulova asked three questions. My answers below.


I have some problems in managing my dataset. I have a dataset with
high frequency data, with observations about time, date, price,
quantity and a code for the type of security.

Question 1
I have the time in this format: 63000, and I want to construct a
"timedate" variable, which contains both the date and the time.

First of all I tried to modify the format of the time by typing:

g hours=int(time/10000)
g minutes=int((time-hours*10000)/100)
g seconds=int(time-hours*10000-minutes*100)
g newtime=hms(hours, minutes, seconds)
format newtime %tc

However, after typing hms() I receive this message:
Unknown function hms()

The next step would be typing something like:

gen double timedate=date*24*60*60*1000+time
format timedate %tcNN/DD/CCYY_HH:MM:SS

Answer 1

-hms()- is an -egen- function written by Kit Baum and is included in the
-egenmore- package from SSC. It will _only_ work with -egen-, not with
-generate-.

But you don't need it. You did almost all the work yourself.

Assuming that e.g. 63000 is 06:30:00 then given your variables -hours-,
-minutes-, -seconds-,

gen long time_in_sec = 3600 * hours + 60 * minutes + seconds
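
To get from there to the combined "timedate" variable asked about, here is a minimal sketch. It assumes that -date- is already a Stata daily date (days since 01jan1960), that -time- is numeric HHMMSS, and that you are running a Stata version that supports %tc formats. The two example observations are made up for illustration.

```stata
* Sketch only: two made-up observations on 26feb2008
clear
set obs 2
gen long date = mdy(2, 26, 2008)
gen long time = 63000
replace time = 144138 in 2

* split HHMMSS into its parts
gen hours   = int(time/10000)
gen minutes = int(mod(time, 10000)/100)
gen seconds = mod(time, 100)
gen long time_in_sec = 3600 * hours + 60 * minutes + seconds

* %tc values are milliseconds since 01jan1960 00:00:00,
* so both pieces must be in milliseconds; double is needed for precision
gen double timedate = date*24*60*60*1000 + time_in_sec*1000
format timedate %tcNN/DD/CCYY_HH:MM:SS
list date time time_in_sec timedate
```

Note that the seconds have to be multiplied by 1000 before being added to the date part; adding the raw HHMMSS value would not give a valid date-time.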

Question 2

I have two types of prices: price1 and price2. I would like to
create a variable which takes the difference between the lowest value
of price1 and the highest value of price2 for each date and time.

First of all I sorted my dataset:

sort date time price1 price2

then I generated the new variable:

gen price3=(price1[_n]-price2[_N] & date==(26feb2008) & time==63000

But as a result I have all missing values. Furthermore, I have to do
this for each date and time, so I was wondering if there is a way to
instruct Stata to create this new variable for each time and date.

Answer 2

I don't understand how you reached your solution, nor why it produces
missing values. I don't think your -sort- order guarantees what you
want, but it looks quite wrong anyway.

This is likely to be closer to where you want to be.

bysort date time (price1) : gen diff = price1[1]
bysort date time (price2) : replace diff = price2[_N] - diff
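
One thing to watch: Stata sorts missing values last, so price2[_N] is missing in any group where some price2 is missing. As a hedged alternative, the -egen- functions min() and max() ignore missings. The sketch below runs both approaches on made-up data; the variable names min1, max2 and diff_egen are my own, not from the original question.

```stata
* Sketch only: made-up prices for two date-time groups (17588 = 26feb2008)
clear
input long date long time double price1 double price2
17588 63000 10.5  10
17588 63000 10.25 10.5
17588 63000 10.75 .
17588 63100 11    10.75
17588 63100 11.25 10.5
end

* approach from the answer: lowest price1, then highest price2, per group
bysort date time (price1) : gen double diff = price1[1]
bysort date time (price2) : replace diff = price2[_N] - diff

* alternative: -egen- min() and max() skip missing values
egen double min1 = min(price1), by(date time)
egen double max2 = max(price2), by(date time)
gen double diff_egen = max2 - min1

list, sepby(date time)
```

In the first group -diff- is missing because one price2 is missing, while -diff_egen- uses the highest nonmissing price2. Where no prices are missing the two agree.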

Question 3

I have to divide my variable time into equal time intervals. For
example, I have observations at 14:41:38 and 15:28:32 and I would like
to have observations at precise time intervals, for example every
5 minutes, i.e. at 14:40:00, 14:45:00 and so on. Any idea on how to do

Answer 3

Create another dataset with regularly spaced observations and then
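
A minimal sketch of the first step, building the regularly spaced dataset, might look like this. The day, the 09:00 to 17:00 window and all variable names are hypothetical choices for illustration, not something taken from your data.

```stata
* Sketch only: one row every 5 minutes for a single (hypothetical) day
clear
local step  = 300          // 5 minutes, in seconds
local first = 9*3600       // 09:00:00 as seconds since midnight
local last  = 17*3600      // 17:00:00 as seconds since midnight
set obs `=(`last' - `first')/`step' + 1'

gen long date = mdy(2, 26, 2008)
format date %td
gen long time_in_sec = `first' + (_n - 1)*`step'

* same date-time construction as for the irregular data
gen double timedate = date*24*60*60*1000 + time_in_sec*1000
format timedate %tcNN/DD/CCYY_HH:MM:SS
list in 1/5
```

Building -timedate- the same way in both datasets gives them a common variable on which they can later be combined.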
