
From: "Nick Cox" <n.j.cox@durham.ac.uk>
To: <statalist@hsphsun2.harvard.edu>
Subject: RE: st: RE: Help with data management
Date: Tue, 15 Jul 2008 17:47:57 +0100

On 1) "didn't work" can mean about twenty different things. Always precisely say what you mean, e.g. error message, incorrect values, puzzling results, computer burst into flames.

-egen, hms()- takes a varlist and varlists don't include commas. That is explicit in the help.

-egen, hms()- works out the time as seconds after midnight, which is exactly what my -generate- statement also did. It thus beats me why you regard -egen, hms()- as what you want to do, but not my statement.

You will clearly need to use the date (i.e. day) information as well. That step was not included in my advice.

On 2) your missings are messing up the solution. Or, my code is broken by missings.

bysort date time : egen min1 = min(price1)
by date time: egen max2 = max(price2)
gen diff = max2 - min1

Nick
n.j.cox@durham.ac.uk

Katia Bobulova

I tried to carry out your instructions, however I still have some problems with questions 1 and 2:

1) I have the time in this format: 63000 and I want to construct a "timedate" variable, which contains both the date and the time. You told me that hms works only with egen, so I tried to do this:

g hours=int(time/10000)
g minutes=int((time-hours*10000)/100)
g seconds=int(time-hours*10000-minutes*100)
egen newtime=hms(hours, minutes, seconds)
format newtime%tc

But it doesn't work and the solution that you suggested (i.e. gen long time_in_sec = 3600 * hours + 60 * minutes + seconds) is not appropriate because it changes time in seconds.

2) I have two types of prices: price1 and price2. I would like to create a variable which takes the difference between the lowest value of price1 and the highest value of price2 for each date and time. You suggested:

bysort date time (price1) : gen diff = price1[1]
bysort date time (price2) : replace diff = price2[_N] - diff

However, after implementing your solution, I receive (for a specific time and date) this result:

      bidprice   askprice   diff
  3.         .     100.15      .
  4.         .     100.17      .
  5.    100.09          .      .
  6.     100.1          .      .
What I would like to have as diff would be (100.15-100.1)=0.05; instead I have all missing values.

2008/7/14 n.j.cox@durham.ac.uk <n.j.cox@durham.ac.uk>:
> Katia Bobulova asked three questions. My answers below.
>
> I have some problems in managing my dataset. I have a dataset with
> high frequency data, with observations about time, date, price,
> quantity and a code for the type of security.
>
> Question 1
> ==========
> I have the time in this format: 63000 and i want to construct a
> "timedate" variable, which contains both the date and the time.
>
> First of all I tried to modify the format of the time typing:
>
> g hours=int(time/10000)
> g minutes=int((time-hours*10000)/100)
> g seconds=int(time-hours*10000-minutes*100)
> g newtime=hms(hours, minutes, seconds)
> format newtime%tc
>
> However, after typing hms() I receive this message:
> Unknown function hms()
> r(133);
>
> The next step would be typing something like:
>
> gen double timedate=date*24*60*60*1000+time
> format timedate %tcNN/DD/CCYY_HH:MM:SS
>
> Answer 1
> ========
>
> -hms()- is an -egen- function written by Kit Baum and is included in the
> -egenmore- package from SSC. It will _only_ work with -egen-, not with
> -generate-.
>
> But you don't need it. You did almost all the work yourself.
>
> Assuming that e.g. 63000 is 06:30:00 then given your variables -hours-,
> -minutes-, -seconds-,
>
> gen long time_in_sec = 3600 * hours + 60 * minutes + seconds
>
> Question 2
> ==========
>
> I have two types of prices: price1 and price2. I would like to
> create a variable which takes the difference between the lowest value
> of price1 and the highest value of price2 for each date and time.
>
> First of all I sorted my dataset:
>
> sort date time price1 price2
>
> then I generated the new variable:
>
> gen price3=(price1[_n]-price2[_N] & date==(26feb2008) & time==63000
>
> But as a result I have all missing values, furthermore I have to do
> this for each date and time, so i was wondering if there is a way to
> instruct Stata to create this new variable for each time and date.
>
> Answer 2
> ========
>
> I don't understand how you reached your solution, nor why it produces
> missing values.
> I don't think your -sort- order guarantees what you want, but it looks
> quite wrong any way.
>
> This is likely to be closer to where you want to be.
>
> bysort date time (price1) : gen diff = price1[1]
> bysort date time (price2) : replace diff = price2[_N] - diff
>
>
> Question 3
> ==========
>
> I have to divide my variable time in equal time intervals. For
> example, i have observations at 14:41:38 and 15:28:32 and I would like
> to have observations at precise time intervals, for example each
> 5-minute, i.e. at 14:40:00, 14:45:00 and so on. Any idea on how to do
> this?
>
> Answer 3
> ========
>
> Create another dataset with regularly spaced observations and then
> -merge-.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

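The three points discussed in this thread can be sketched as one short do-file. This is a sketch under stated assumptions, not code posted by either author. The four observations are typed in from the listing in the thread, the date 26feb2008 is the one mentioned in the original question, and price1 is assumed to be askprice and price2 to be bidprice. The names datetime_sec and slot are made up for illustration. dhms() and the %tc format are assumed to be available, which requires Stata 10 or later. On older versions only the seconds-based variable applies.

```stata
* Sketch only: the four observations from the listing in the thread, typed by hand.
* Assumption: price1 is askprice and price2 is bidprice.
clear
input long time double price1 double price2
63000 100.15 .
63000 100.17 .
63000 .      100.09
63000 .      100.10
end
gen long date = mdy(2, 26, 2008)      // 26feb2008, as in the original question

* 1) Time of day, reading 63000 as 06:30:00
gen hours   = int(time / 10000)
gen minutes = int(mod(time, 10000) / 100)
gen seconds = mod(time, 100)
gen long time_in_sec = 3600 * hours + 60 * minutes + seconds

* Date and time together, as seconds since 01jan1960 00:00:00
* (datetime_sec is a made-up name)
gen double datetime_sec = 86400 * date + time_in_sec

* Assumption: Stata 10 or later, where dhms() returns a %tc value
* (milliseconds since 01jan1960 00:00:00)
gen double timedate = dhms(date, hours, minutes, seconds)
format timedate %tc

* 2) Sorting puts missings last, so price2[_N] is missing whenever any
*    price2 in the group is missing. egen's min() and max() ignore missings.
bysort date time: egen double min1 = min(price1)
by date time: egen double max2 = max(price2)
gen double diff = max2 - min1         // here 100.10 - 100.15, about -0.05
* use min1 - max2 instead if the positive gap is wanted

* 3) Start of the 5-minute interval containing each time (slot is a made-up name)
gen long slot = 300 * floor(time_in_sec / 300)
```

For Question 3, a time such as 14:41:38 is 52898 seconds after midnight and falls in slot 52800, i.e. 14:40:00. A variable like slot could then serve as the key when merging with a dataset of regularly spaced times. The -merge- syntax differs across Stata versions, so that step is left out here.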
**References**:
- **st: Help with data management** *From:* Katia Bobulova <katia.bobulova@googlemail.com>
- **st: RE: Help with data management** *From:* "n.j.cox@durham.ac.uk" <n.j.cox@durham.ac.uk>
- **Re: st: RE: Help with data management** *From:* "Katia Bobulova" <katia.bobulova@googlemail.com>

