


RE: st: RE: repeat same commands over hundreds of files

From   Nick Cox
Subject   RE: st: RE: repeat same commands over hundreds of files
Date   Tue, 2 Nov 2010 20:24:46 +0000

I think it's clear in principle that it can work, but absent detail on the structure of your folders and filenames, it is difficult to give concrete guidance. 

It is easiest if the files are all in one directory and share a similar naming structure.
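For that one-directory case, a minimal sketch of the loop might look like the following. The folder names in indir and outdir and the _collapsed suffix are hypothetical; the analysis commands are those of the example do-file further down and are left out here.

```stata
* Sketch only: assumes every .csv file sits in a single folder.
* Both folder names below are hypothetical; adjust to your own layout.
local indir  "/Users/tbrunell/MPG/all"
local outdir "/Users/tbrunell/MPG/collapsed"

* Extended macro function: collect the names of all .csv files in the folder
local files : dir "`indir'" files "*.csv"

foreach f of local files {
    * Strip the .csv extension to get a stub for the output filename
    local stub : subinstr local f ".csv" ""

    insheet using "`indir'/`f'", clear
    drop in L    /* drops the file notation at the bottom */

    * ... the gen/replace/collapse commands go here, written once ...

    save "`outdir'/`stub'_collapsed", replace
}
```

Because the commands appear only once inside the loop, adding a command later means editing one do-file rather than hundreds.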

There is a trade-off between upstream work (moving and/or renaming files within the OS to get a very simple structure) and downstream work (looping over folders and/or filenames in Stata). Which to favour will depend on your relative fluency in Unix (?) and Stata.
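If the files stay where they are, the downstream route could be sketched as a nested loop. This assumes, judging only from the CT path in the example, that there is one subfolder per state directly under /Users/tbrunell/MPG and that each holds the .csv files for that state; the _collapsed suffix is again hypothetical.

```stata
* Sketch only: assumes one subfolder per state under the root folder.
local root "/Users/tbrunell/MPG"

* List the subfolders of the root (assumed to be the state folders)
local states : dir "`root'" dirs "*"

foreach s of local states {
    * List the .csv files inside this state's folder
    local files : dir "`root'/`s'" files "*.csv"

    foreach f of local files {
        * Output name is built from the input name, minus its extension
        local stub : subinstr local f ".csv" ""

        insheet using "`root'/`s'/`f'", clear

        * ... the same analysis commands, written once ...

        save "`root'/`s'/`stub'_collapsed", replace
    }
}
```

Since the input and output names are both built from the loop macros, no filename needs to be typed by hand.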



Or should I say, perhaps it can, but I am not sure how.

On Nov 2, 2010, at 3:02 PM, Nick Cox wrote:

> Why not? 
> "But that doesn't solve my filename and output file problem."
> I am doing some simple analysis on election data that spans all the states and several decades.
> So I have hundreds of files that I want to do the same relatively simple analysis on (I have an example below).
> At first I started writing .do files for each state/year and the only things I changed were the 
> 1) file name for the insheet command
> 2) the name and location of the collapsed file at the end.
> However, when I wanted to add an additional command this meant opening hundreds of separate .do files, making a change, and resaving each file. It is not the end of the world, but I would prefer to set up the commands and then, somehow, tell Stata to run the commands separately for each specified file and then save the resulting file with some new name.
> The techs at Stata recommended using macros for file names and the foreach command.  But that doesn't solve my filename and output file problem.
> Any recommendations would be much appreciated.
> Tom Brunell
> Professor of Political Science
> University of Texas at Dallas
> _____________________________
> clear
> insheet using "/Users/tbrunell/MPG/CT/mpg_09_CTC1972_1972_EDCD11_10_JH22.csv"
> drop in L /*this drops file notation at the bottom*/
> compress
> gen demper=dem/(dem+rep)
> gen demwin=.
> replace demwin=1 if demper>.5 & demper~=.
> replace demwin=0 if demper<.5
> sort rkey
> gen overalldemper=overalldem/(overalldem+overallrep)
> *here overalldemper will be total votes percentage, demper is "normalized" vote - averaged across districts
> collapse (count) numberofseats=demper (sum) demwin (mean) year demper overalldemper (p50) median=demper,by(rkey)
> gen percentdemdist=demwin/numberofseats
> save "/Users/tbrunell/MPG/CT/CTC1970s", replace
> *

