



Re: st: Breaking huge lines and creating variables


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Breaking huge lines and creating variables
Date   Fri, 30 Sep 2011 23:53:58 +0100

Does look like a Mata job.

But it seems unlikely that you are the first person to work with such
data, so asking what formats people use sounds best.

Nick
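An editorial sketch, not part of the original exchange: the layout described in the quoted message below (268=N announces N entries, and each entry begins at tag 279) can be split one line at a time, so that only a single line is ever held in memory. It is written in Python rather than Mata purely for illustration. The function names `split_entries` and `to_rows` and the choice of tags to keep are hypothetical, and the sample line is abridged from the posted one with the 268 count adjusted to match.

```python
def split_entries(line, start_tag="279"):
    """Split one 'tag=value tag=value ...' line into (header, entries).

    Fields seen before the first start_tag go to the header; each start_tag
    opens a new entry. start_tag="279" follows the description in the thread.
    """
    header, entries, current = {}, [], None
    for token in line.split():
        tag, sep, value = token.partition("=")
        if not sep:
            continue  # skip fragments without '=', e.g. a token cut off by wrapping
        if tag == start_tag:
            current = {}
            entries.append(current)
        target = header if current is None else current
        target[tag] = value
    return header, entries


def to_rows(line, keep_header=("52",), keep_entry=("279", "269", "270", "271")):
    """Return one dict per entry, repeating the chosen header fields on each row.

    The tags kept here are an illustrative assumption; tags that an entry
    does not carry are filled with an empty string.
    """
    header, entries = split_entries(line)
    rows = []
    for entry in entries:
        row = {tag: header.get(tag, "") for tag in keep_header}
        row.update({tag: entry.get(tag, "") for tag in keep_entry})
        rows.append(row)
    return rows


# Abridged sample: two entries only, so 268 is set to 2 here (assumption).
sample = ("8=FIX.4.4 35=X 52=20110126-11:00:00.206 268=2 "
          "279=0 269=2 278=213 55=DOLG11 270=1670 271=300 "
          "279=2 269=0 278=214 55=DOLG11")

header, entries = split_entries(sample)
rows = to_rows(sample)
```

For a file with millions of lines, the same functions could be applied inside a loop over the open file, writing each batch of rows to a delimited file that Stata can then import. Comparing `len(entries)` with `int(header["268"])` gives a cheap check that a line was not truncated.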

On Fri, Sep 30, 2011 at 9:08 PM, Pedro Nakashima
<nakashimapedro@gmail.com> wrote:
> Ok, here's a sample line (remark: there are files in which there are
> more than 10 million lines):
>
> 8=FIX.4.4 9=10157 35=X 34=43344 49=FIXGatewayDerivatives 52=20110126-11:00:00.206 56=MD01A 10016=31418_DOLG11_181_70 75=20110126 268=83 279=0 269=2 278=213 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=300 272=20110126 273=11:00:00 274=2 288=BM000735 289=BM000150 451=-4.162 6032=1 279=2 269=0 278=214 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000010 288=BM000735 290=14 279=0 269=2 278=215 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=200 272=20110126 273=11:00:00 274=3 288=BM000127 289=BM000150 451=-4.162 6032=2 279=2 269=0 278=216 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000002 288=BM000127 290=9 279=2 269=1 278=217 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:00 37=000013 289=BM000150 290=12 279=0 269=2 278=218 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 270=1670 271=75 272=20110126 273=11:00:00 274=3 288=BM000227 289=BM000119 451=-4.162 6032=3 279=2 269=0 278=219 55=DOLG11 48=BMFBR8618851 22=8 207=XBMF 272=20110126 273=11:00:
>
> My database concerns the FX market and was provided by the Brazilian
> futures and commodities exchange (it's an intraday database).
>
> 268=83 indicates that 83 entries follow and the beginning of each
> entry is indicated by the code 279
>
>
> 2011/9/30 Nick Cox <njcoxstata@gmail.com>:
>> Please don't show an abstraction. Show us a concrete example.
>>
>> It all depends on the length of the lines, which you don't give. If
>> this can be read into one or more string variables, Mata may not be
>> necessary.
>>
>> Nick
>>
>> On Fri, Sep 30, 2011 at 3:59 PM, Pedro Nakashima
>> <nakashimapedro@gmail.com> wrote:
>>> Dear statalisters,
>>>
>>> My current database (it's in .txt) has a typical line of the form:
>>>
>>> cod1=time1   cod2=...  cod3=n   cod4=type1   cod5=quantity1
>>> cod6=price1   cod4=type2   cod5=quantity2   cod6=price2
>>>
>>> cod3=n (n=2 in this case, and n varies across lines) says that
>>> following this code there are 2 records, each of which starts with
>>> cod4=type. For example:
>>> cod4=type1   cod5=quantity1   cod6=price1         and
>>> cod4=type2   cod5=quantity2   cod6=price2          are records.
>>>
>>> I want to generate a new database in the form:
>>>
>>> cod1     cod4      cod5         cod6
>>> time1    type1    quantity1    price1
>>> time1    type2    quantity2    price2
>>>
>>> I think it's only possible to do that with Mata given the lengths of
>>> the lines, is that right?
>>>
>>> Can anyone give me a direction?
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/statalist/faq
>> *   http://www.ats.ucla.edu/stat/stata/
>>


