



Re: st: repeat same commands over hundreds of files


From   Eric Booth <[email protected]>
To   "<[email protected]>" <[email protected]>
Subject   Re: st: repeat same commands over hundreds of files
Date   Tue, 2 Nov 2010 23:52:12 +0000


Ugh... there were a couple of obvious mistakes in my previous post that are corrected in the code below (I'm sure there are others). The most important was that the -macro shift- was in the wrong loop.
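To see why the placement matters, here is a minimal sketch of the -tokenize- / -while- / -macro shift- pattern on its own. The local `names' and its three values are made up for illustration; they stand in for the list of filenames. If -macro shift- sits outside the -while- loop, the first token never changes and the loop keeps testing the same name.

```stata
* Sketch only: the three names stand in for the list of filenames.
local names "one two three"
tokenize `"`names'"'

local n = 0
while `"`1'"' != "" {
    di "processing `1'"
    local ++n
    macro shift        // drop the first token; the rest move up one position
}
di "`n' names processed"
```

A -foreach- over the same list would avoid the shift altogether; the -while- version is shown because it is what the code in this post uses.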

Also, after thinking about this a bit more, it seems like you might be preparing this data to go into one file. If you plan to combine (probably append) the results in any way, you can take advantage of the loop by creating a "master" file and adding an -append- statement inside the loop (see notes below):


*********************!
global sf   "/Users/tbrunell/MPG//"


*!* added for an append *!*
clear
sa "$sf/MASTER.dta", replace emptyok
****


**grab all folders**
global folders:  dir  "$sf" dirs "*", respectcase
di `"$folders"'   // these should be all your state subfolders


foreach f  of global folders  {
       di "Folder: `f'"

**grab all files in each folder**
global files: dir `"$sf/`f'"' files "*.csv", respectcase
di in green `"$files"'   //make sure this worked


**filter out the file extension so that we can save it as .dta**
global files: subinstr global files  ".csv" "", all
di in yellow `"$files"'   // make sure this worked


token `"$files"'

while `"`1'"' != "" {

cap confirm file  "$sf//`f'//`1'.csv"
  if !_rc {

clear
insheet using "$sf//`f'//`1'.csv"
drop in L /*this drops file notation at the bottom*/
compress
gen demper=dem/(dem+rep)
gen demwin=.
replace demwin=1 if demper>.5 & demper~=.
replace demwin=0 if demper<.5
sort rkey
gen overalldemper=overalldem/(overalldem+overallrep)
collapse (count) numberofseats=demper (sum) demwin (mean) year demper overalldemper (p50) median=demper,by(rkey)
gen percentdemdist=demwin/numberofseats

*!*  added for the append  *!*
g state = "`f'"

*!*  add the year to dataset, extracted from filename you described*!*
g filename = "`1'"
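* note: -year- already exists after the -collapse- above, so the
* year taken from the filename goes into a new string variable
g fileyear = regexs(0) if  regexm(filename, "[\_][0-9][0-9][0-9][0-9][\_]")
replace fileyear = subinstr(fileyear, "_", "", .)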
****
sa "$sf//`f'//`1'.dta", replace



*!*added next lines for the append to the MASTER*!*
u "$sf/MASTER.dta", clear
append using "$sf//`f'//`1'.dta"
sa "$sf/MASTER.dta", replace
*****

}

else {

/*
note:  the -confirm- if/else block isn't really necessary when
using this approach (it's better applied when using
the forvalues loop approach I described earlier), but it helped
me diagnose when I was missing a `f' in one of my paths, and
it doesn't get in the way, so I left it in -- you can take it out
if you don't want it
*/

*!**  fixed the next line -->
di "file $sf//`f'//`1'.csv does not exist!"
             }
*!**   moved -mac shift- inside the -while- loop   
mac shift
   }

******* mac shift

}


*!** new **!*
u "$sf/MASTER.dta", clear
desc, sh
ta state
ta year
l filename year state  //-->  check this 
*********************!
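
If you want to try the "empty master, then append" idea in isolation before running it over hundreds of files, something like the following should work as a rough sketch. The two filenames, the tempfiles, and the variable names are all hypothetical, and I am assuming the year sits between two underscores in the filename.

```stata
* Sketch only: filenames, tempfiles, and variable names are made up.
clear
tempfile master piece
save "`master'", emptyok               // zero-observation master to append onto

foreach name in AL_2004_results AL_2006_results {
    clear
    set obs 2                          // stand-in for one file's collapsed results
    gen filename = "`name'"
    * capture the four digits between underscores, then convert to a number
    gen yearstr = regexs(1) if regexm(filename, "_([0-9][0-9][0-9][0-9])_")
    gen fileyear = real(yearstr)
    save "`piece'", replace

    * add this piece to the running master file
    use "`master'", clear
    append using "`piece'"
    save "`master'", replace
}

use "`master'", clear
tab fileyear
```

Re-opening and re-saving the master on every pass is simple but gets slower as the master grows; with hundreds of small collapsed files that is unlikely to matter.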


- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]




