Re: st: repeat do-file over multiple files


From   Daniel Bela <daniel.bela@uni-bamberg.de>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: repeat do-file over multiple files
Date   Sun, 4 Mar 2012 13:48:20 +0100

Dear Allan,

> The same do-file should be run on all 256 files and saved as 256 new
> files.
> And lastly the 256 new files should be combined using append to one
> file.

From my point of view, that workflow seems quite cumbersome. In addition to
the earlier suggestions, you could do it the following way, which preserves
the sort order of the observations:

---- begin statacode ----
    clear
    cd "/Users/Stata/ProjectX/"
    local filelist: dir "." files "*.dta", respectcase
    local num=0
    local appendlist
    /* work on every file and temp-save it */
    foreach file of local filelist {
        use "`file'"
        tempfile file`++num'
        display as text in smcl  "working on file number {it:`num'}..."
        /* you could also -do- an external do-file here; note that this
           do-file should not -use- or -save- anything, this already happened!
        collapse <...>
        */
        display as text in smcl  "... finished working on file number {it:`num'}"
        save `file`num''
    }
    /* concatenate files */
    forvalues filenum=1/`num' {
        if `filenum'==1 use `file`filenum''
        else local appendlist: list appendlist | file`filenum'
    }
    append using `appendlist'
---- end statacode ----

My main point is this: it would be more straightforward to concatenate
the files first and do your data preparation afterwards;
for example:

---- begin statacode ----
    clear
    cd "/Users/Stata/ProjectX/"
    local filelist: dir "." files "*.dta", respectcase
    /* concatenate files */
    local firstfile=`"""'+"`: word 1 of `filelist''"+`"""'
    local otherfiles: list filelist - firstfile
    use `firstfile'
    append using `otherfiles', generate(source)
    /* you now have a variable "source" identifying groups of
       observations from each file;
       work on every generated group instead of single files;
       note that most other data preparation commands support the
       -bysort- prefix, doing the same as by() for collapse
    */
    /* you could also -do- an external do-file here; note that this has
       to perform every operation by(source)
    collapse <...>, by(source)
    */
    drop source
---- end statacode ----
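
To illustrate the -bysort- remark, here is a minimal sketch on made-up
data. The variable names x, mean_x and obs_in_file and the values are purely
illustrative assumptions, not taken from your files; "source" stands in for
the variable created by generate(source):

---- begin statacode ----
    clear
    /* toy data: two "files" (source 0 and 1) with two observations each */
    input source x
    0 1
    0 3
    1 10
    1 20
    end
    /* per-file mean, computed without collapsing the data */
    bysort source: egen mean_x = mean(x)
    /* running observation number within each file */
    bysort source: generate obs_in_file = _n
    list, sepby(source)
---- end statacode ----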

Regards

Bela

-- 
Daniel Bela
National Educational Panel Study (NEPS)
Data Center

postal address:
Otto-Friedrich-University Bamberg, NEPS
96045 Bamberg
GERMANY

visitor's address:
Otto-Friedrich-Universität Bamberg, NEPS
Wilhelmsplatz 3, Room 112, 96047 Bamberg

phone:     +49 951 8633428
facsimile: +49 951 8633405
website: http://www.neps-data.de/