
Notice: On March 31, it was announced that Statalist is moving from an email list to a forum. The old list will shut down on April 23, and its replacement, statalist.org is already up and running.



Re: st: Breaking huge lines and creating variables


From   Pedro Nakashima <nakashimapedro@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Breaking huge lines and creating variables
Date   Fri, 30 Sep 2011 17:08:35 -0300

OK, here's a sample line (note: some files have more than 10 million
lines):

8=FIX.4.49=1015735=X34=4334449=FIXGatewayDerivatives52=20110126-11:00:00.20656=MD01A10016=31418_DOLG11_181_7075=20110126268=83279=0269=2278=21355=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=300272=20110126273=11:00:00274=2288=BM000735289=BM000150451=-4.1626032=1279=2269=0278=21455=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000010288=BM000735290=14279=0269=2278=21555=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=200272=20110126273=11:00:00274=3288=BM000127289=BM000150451=-4.1626032=2279=2269=0278=21655=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000002288=BM000127290=9279=2269=1278=21755=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:0037=000013289=BM000150290=12279=0269=2278=21855=DOLG1148=BMFBR861885122=8207=XBMF270=1670271=75272=20110126273=11:00:00274=3288=BM000227289=BM000119451=-4.1626032=3279=2269=0278=21955=DOLG1148=BMFBR861885122=8207=XBMF272=20110126273=11:00:

My database covers the FX market and was provided by the Brazilian
futures and commodities exchange (it's an intraday database).

268=83 indicates that 83 entries follow, and the beginning of each
entry is marked by code 279.
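
A minimal sketch of the reshaping described above, written in Python rather
than Stata or Mata. It rests on assumptions that are not confirmed in this
thread: that the raw file still contains the standard FIX field separator
(SOH, byte 0x01), which is not visible in the sample as pasted here, and that
tag 52 is the time stamp to carry onto every row. The function name
parse_fix_line and the demo line are hypothetical; the demo line is built by
hand from a few values in the sample, with separators reinserted where I
assume they fell and a placeholder checksum.

```python
SOH = "\x01"  # standard FIX field separator; assumed to be present in the raw file


def parse_fix_line(line, keep_header=("52",), count_tag="268", start_tag="279"):
    """Split one FIX message into one dict per entry.

    Returns (declared_count, rows). Each row holds the header tags listed in
    keep_header plus every tag seen from one start_tag up to the next.
    """
    header, rows, current, declared = {}, [], None, None
    for field in line.strip("\r\n").split(SOH):
        if "=" not in field:
            continue  # skip empty pieces, e.g. after the final separator
        tag, value = field.split("=", 1)
        if tag == "10":
            break  # CheckSum is the FIX trailer, not part of any entry
        if tag == count_tag and declared is None:
            declared = int(value)  # number of entries the message announces
        elif tag == start_tag and declared is not None:
            current = {tag: value}  # a new entry begins here
            rows.append(current)
        elif current is not None:
            current[tag] = value  # field belongs to the entry in progress
        else:
            header[tag] = value  # field appears before the first entry
    shared = {t: header.get(t) for t in keep_header}
    return declared, [{**shared, **row} for row in rows]


# Hand-built demo line (assumption: separators reinserted, checksum is a placeholder).
demo = SOH.join([
    "35=X", "52=20110126-11:00:00.206", "268=2",
    "279=0", "269=2", "278=213", "270=1670", "271=300",
    "279=2", "269=0", "278=214",
    "10=000",
]) + SOH

declared, rows = parse_fix_line(demo)
for row in rows:
    print(row)
```

Comparing declared with len(rows) gives a cheap check that a line was not
truncated. Because each line is handled on its own, a file with millions of
lines can be read one line at a time instead of being loaded whole.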


2011/9/30 Nick Cox <njcoxstata@gmail.com>:
> Please don't show an abstraction. Show us a concrete example.
>
> It all depends on the length of the lines, which you don't give. If
> this can be read into one or more string variables, Mata may not be
> necessary.
>
> Nick
>
> On Fri, Sep 30, 2011 at 3:59 PM, Pedro Nakashima
> <nakashimapedro@gmail.com> wrote:
>> Dear statalisters,
>>
>> My current database (it's in .txt) has a typical line in the form:
>>
>> cod1=time1   cod2=...  cod3=n   cod4=type1   cod5=quantity1
>> cod6=price1   cod4=type2   cod5=quantity2   cod6=price2
>>
>> cod3=n (n=2 in this case, and n varies across lines) says that
>> following this code there are 2 records, each of which starts with
>> cod4=type. For example:
>> cod4=type1   cod5=quantity1   cod6=price1         and
>> cod4=type2   cod5=quantity2   cod6=price2          are records.
>>
>> I want to generate a new database in the form:
>>
>> cod1     cod4      cod5         cod6
>> time1    type1    quantity1    price1
>> time1    type2    quantity2    price2
>>
>> I think it's only possible to do that with Mata given the lengths of
>> the lines; is that right?
>>
>> Can anyone give me a direction?
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


