Statalist



Re: st: RE: Help with data management


From:    "Katia Bobulova" <katia.bobulova@googlemail.com>
To:      statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: Help with data management
Date:    Tue, 15 Jul 2008 17:23:43 +0100

Dear Nick,

Thank you very much for your reply.
I tried to carry out your instructions, but I still have some
problems with questions 1 and 2:

1) I have the time in this format: 63000, and I want to construct a
"timedate" variable that contains both the date and the time.

You told me that hms works only with egen, so I tried this:

g hours=int(time/10000)
g minutes=int((time-hours*10000)/100)
g seconds=int(time-hours*10000-minutes*100)
egen newtime=hms(hours, minutes, seconds)
format newtime%tc

But it doesn't work, and the solution that you suggested (i.e. gen
long time_in_sec = 3600 * hours + 60 * minutes + seconds) is not
appropriate because it converts the time into seconds.
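
A possible route to a combined date-and-time variable, sketched under two assumptions: Stata 10 or later (where dhms() is a built-in function and %tc formats are available), and a date variable that is already a Stata daily date. The names hh, mm and ss are hypothetical, chosen so that they do not clash with the variables created above.

```stata
* Split an HHMMSS number such as 63000 into hours, minutes and seconds
gen hh = floor(time / 10000)
gen mm = floor(mod(time, 10000) / 100)
gen ss = mod(time, 100)

* Assumption: Stata 10 or later, and -date- is a daily (%td) date.
* dhms() returns milliseconds since 01jan1960 00:00:00, so store as double.
gen double timedate = dhms(date, hh, mm, ss)
format timedate %tc
```

Storing the result as double matters here, because millisecond values are too large to be held exactly in a float.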

2) I have two types of prices: price1 and price2. I would like to
create a variable which takes the difference between the lowest value
of price1 and the highest value of price2 for each date and time.

You suggested:

bysort date time (price1) : gen diff = price1[1]
bysort date time (price2) : replace diff = price2[_N] - diff

However, after implementing your solution, I receive (for a specific
time and date) this result:


      bidprice   askprice   diff
 3.          .     100.15      .
 4.          .     100.17      .
 5.     100.09          .      .
 6.      100.1          .      .

What I would like to have as diff is (100.15-100.1)=0.05;
instead I have all missing values.
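
One likely reason for the missing values: Stata sorts numeric missing values after all nonmissing values, so price2[_N] is missing whenever any price2 in the group is missing. A minimal sketch of an alternative follows, assuming that price1 is askprice and price2 is bidprice as in the listing above; lowest_ask and highest_bid are hypothetical names. The -egen- functions min() and max() ignore missing values within each group.

```stata
* Lowest ask and highest bid within each date-time group;
* min() and max() skip missing values
bysort date time: egen double lowest_ask  = min(askprice)
bysort date time: egen double highest_bid = max(bidprice)

* Assumption: the wanted sign is lowest ask minus highest bid
gen double diff = lowest_ask - highest_bid
```

If observations 3 to 6 make up the whole group, this should give a diff of about 0.05 in each of them.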

Could you please help me to solve these problems?

Thank you very much.

Best,
Katia

2008/7/14 n.j.cox@durham.ac.uk <n.j.cox@durham.ac.uk>:
> .
>
> Katia Bobulova asked three questions. My answers below.
>
> Nick
> n.j.cox@durham.ac.uk
>
> I have some problems in managing my dataset. I have a dataset with
> high frequency data, with observations about time, date, price,
> quantity and a code for the type of security.
>
> Question 1
> ==========
> I have the time in this format: 63000, and I want to construct a
> "timedate" variable that contains both the date and the time.
>
> First of all I tried to modify the format of the time by typing:
>
> g hours=int(time/10000)
> g minutes=int((time-hours*10000)/100)
> g seconds=int(time-hours*10000-minutes*100)
> g newtime=hms(hours, minutes, seconds)
> format newtime%tc
>
> However, after typing hms() I receive this message:
> Unknown function hms()
> r(133);
>
> The next step would be typing something like:
>
> gen double timedate=date*24*60*60*1000+time
> format timedate %tcNN/DD/CCYY_HH:MM:SS
>
> Answer 1
> ========
>
> -hms()- is an -egen- function written by Kit Baum and is included in the
> -egenmore- package from SSC. It will _only_ work with -egen-, not with
> -generate-.
>
> But you don't need it. You did almost all the work yourself.
>
> Assuming that e.g. 63000 is 06:30:00 then given your variables -hours-,
> -minutes-, -seconds-,
>
> gen long time_in_sec = 3600 * hours + 60 * minutes + seconds
>
> Question 2
> ==========
>
> I have two types of prices: price1 and price2. I would like to
> create a variable which takes the difference between the lowest value
> of price1 and the highest value of price2 for each date and time.
>
> First of all I sorted my dataset:
>
> sort date time price1 price2
>
> then I generated the new variable:
>
> gen price3=(price1[_n]-price2[_N] & date==(26feb2008) & time==63000
>
> But as a result I have all missing values. Furthermore, I have to do
> this for each date and time, so I was wondering if there is a way to
> instruct Stata to create this new variable for each time and date.
>
> Answer 2
> ========
>
> I don't understand how you reached your solution, nor why it produces
> missing values.
> I don't think your -sort- order guarantees what you want, but it looks
> quite wrong anyway.
>
> This is likely to be closer to where you want to be.
>
> bysort date time (price1) : gen diff = price1[1]
> bysort date time (price2) : replace diff = price2[_N] - diff
>
>
> Question 3
> ==========
>
> I have to divide my time variable into equal time intervals. For
> example, I have observations at 14:41:38 and 15:28:32 and I would like
> to have observations at precise time intervals, for example every
> 5 minutes, i.e. at 14:40:00, 14:45:00 and so on. Any idea how to do
> this?
>
> Answer 3
> ========
>
> Create another dataset with regularly spaced observations and then
> -merge-.
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/


