st: Pulling in files and data stored in a folder tree


From   "Ben Hoen" <bhoen@lbl.gov>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Pulling in files and data stored in a folder tree
Date   Fri, 27 Jul 2012 11:26:31 -0400

Hi Statalisters,

I have a set of ~200,000 records stored in one dataset ("master file"), each
of which has a year and a county to which it applies, and a unique record
id.  Separately, I have a large set of files that are stored by county (of
which there are 20, so there are 20 county folders) and year (for each
county there are 10 year folders, 2002 through 2011).  In each year folder,
there are 4 files that I want to pull data from (via 1:1 merge with the
"master file" using the record id).  There are roughly 10 variables I want
to add to the master file from these 4 files, or approximately 2 to 3 from
each file.

So, the question is how I might write code that will go through each record
in the master file, determine the year and the county, go through the folder
tree to find the appropriate year in the appropriate county, and then merge
with the four files, "keeping" the data from the 10 variables?

A few things to note:  1) the files I want to pull data from are column
separated text files (i.e., I have not gone through the trouble of
converting them to Stata files yet, but could); and, 2) all of the files
from which I want to pull data are named by county and year (e.g.,
<countyname>_<year>_<filename>) and these names match exactly with the
county names and years stored in the master file.
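
For illustration only, here is a minimal sketch of one way the task described
above might be approached in Stata. Rather than visiting each record, it loops
over the county and year values found in the master file, stacks each of the
four file types into one dataset, and then merges each stacked dataset 1:1 on
the record id. Every name in it is an assumption and not from the actual data:
the master file name (master.dta), the root folder, the variable names
(record_id, county, year), the <filename> parts (fileA to fileD), the .txt
extension, and the variables kept (x1 to x8). It also assumes the text files
are tab- or comma-delimited with a header row, so that -insheet- can read
them, and that record ids are unique across all counties and years.

```stata
* Sketch under stated assumptions; all names below are hypothetical.
local rootdir "C:/data/counties"          // assumed top of the folder tree
local stems   "fileA fileB fileC fileD"   // assumed <filename> parts

* assumed variables wanted from each of the four files
local keep_fileA "x1 x2"
local keep_fileB "x3 x4"
local keep_fileC "x5 x6"
local keep_fileD "x7 x8"

* collect the county names and years that actually occur in the master file
use "master.dta", clear
levelsof county, local(counties)
levelsof year,   local(years)

foreach s of local stems {
    * start an empty dataset that will collect this file type from all folders
    clear
    tempfile all_`s'
    save "`all_`s''", emptyok

    foreach c of local counties {
        foreach y of local years {
            * build <root>/<county>/<year>/<county>_<year>_<filename>.txt
            local fn "`rootdir'/`c'/`y'/`c'_`y'_`s'.txt"
            capture confirm file "`fn'"
            if _rc continue               // skip county-years with no file

            insheet using "`fn'", clear
            keep record_id `keep_`s''
            append using "`all_`s''"
            save "`all_`s''", replace
        }
    }
}

* merge the four stacked datasets onto the master file, one at a time
use "master.dta", clear
foreach s of local stems {
    merge 1:1 record_id using "`all_`s''", keep(master match) nogenerate
}
```

The whole block would need to be run as a single do-file, since the local
macros and temporary files disappear when the run ends. Stacking first keeps
the number of merges at four instead of one per file; if a record id turned
out to repeat across folders, the 1:1 merge would stop with an error, which
is a useful check in itself.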

I suspect many have done this type of thing before, so if anyone has some
reading that they could send me to, I would be very appreciative.

Thanks, in advance,

Ben

Ben Hoen
Principal Research Associate
Lawrence Berkeley National Laboratory
Office: 845-758-1896
Cell: 718-812-7589
bhoen@lbl.gov
http://eetd.lbl.gov/ea/emp/staff/hoen.html

Visit our publications at: 
http://eetd.lbl.gov/ea/ems/emp-pubs.html

Sign up for our email list to receive publication notifications:
http://eetd.lbl.gov/ea/emp/list/emp_pubs_signup.php



