



st: Stata - efficiently appending 200+ files (my method takes hours)


From   Sunita Surana <surana@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: Stata - efficiently appending 200+ files (my method takes hours)
Date   Sat, 21 Dec 2013 13:55:13 -0500

I am trying to append approximately 200 files using Stata. The code I
am using is below. The problem is that it takes far too long: over 5
hours. The final appended file has over 28 million observations and is
about 2GB in size. I suspect the cause is that the loop saves the full
combined file on every iteration. I also tried using a tempfile
instead, but that is slow as well. My colleague, on the other hand, did
the same append in minutes using SAS; his code is below as well. I
would very much appreciate it if someone could show me how to do this
efficiently in Stata, so that it does not take hours. Thanks much!

My Stata code:

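    * Open the list of file names and read the first entry
    file close _all
    file open myfile using "$OP\filelist_test.txt", read
    file read myfile line

    cd "$OP"
    insheet using "`line'", comma clear
    tostring optionconditioncode, replace

    * Start the combined file with the first input file
    save "$data\options_all", replace

    file read myfile line

    * For each remaining file: import it, append the combined file so far,
    * and save the combined file again
    while r(eof)==0 {
        insheet using "`line'", comma clear
        tostring optionconditioncode, replace
        append using "$data\options_all"
        save "$data\options_all", replace

        file read myfile line
    }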

    file close myfile
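
One possible restructuring, offered as an untested sketch rather than a measured fix: the loop above reads and rewrites the whole combined file on every pass, so the disk I/O grows with each file added. The version below instead saves each imported file once as its own temporary .dta file and combines them all in a single append at the end. It assumes the same $OP and $data globals and the same file list as above. The macro names i, parts, and part1, part2, ... are made up for illustration, and it relies on a version of Stata whose append accepts a list of file names.

```stata
* Sketch only: import each CSV once, then append everything in one pass.
* Assumes $OP, $data and filelist_test.txt are set up as in the code above.
cd "$OP"

local i = 0
local parts

file close _all
file open myfile using "$OP\filelist_test.txt", read
file read myfile line

while r(eof)==0 {
    insheet using "`line'", comma clear
    tostring optionconditioncode, replace

    * Save this file once under its own temporary name (hypothetical macro names)
    local ++i
    tempfile part`i'
    save "`part`i''"

    * Remember the temporary file name for the final append
    local parts `parts' "`part`i''"

    file read myfile line
}

file close myfile

* Combine all temporary files in a single append, then save once
clear
append using `parts'
save "$data\options_all", replace
```

The temporary files need roughly as much free disk space as the final combined file, and they are removed when the do-file finishes.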

*******

My colleague's SAS code:

data all_text (drop=fname);
      length myfilename $100;
      set dirlist;
      filepath = "&dirname\"||fname;
      infile dummy filevar = filepath length=reclen end=done missover
             dlm=',' firstobs=2 dsd;
      do while(not done);
        myfilename = filepath;
        input var1
              var2
              var3
              var4;
        output;
      end;
run;