Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Tim Evans <Tim.Evans@phe.gov.uk> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | RE: st: RE: recursively search folder sub directories and store filenames in a text file |
Date | Thu, 31 Oct 2013 09:46:07 +0000 |
Robert,

Thanks for all of your help. I eventually went down the route of saving the results in a log file and then reading the files in; the code I used is below. I did try to take advantage of the dataset you helped create in your second suggestion, but I couldn't get around the fact that I had to load that file to access the values (filenames) while at the same time needing an empty dataset.

--BEGIN CODE--
clear all
cd "T:\Final"

cap program drop dirlist
program define dirlist

    syntax, fromdir(string)

    // list of all csv files in "`fromdir'"
    local flist: dir "`fromdir'" files "*.csv"
    foreach f of local flist {
        dis "`fromdir'/`f'"
    }

    // recursively list directories in "`fromdir'"
    local dlist: dir "`fromdir'" dirs "*"
    foreach d of local dlist {
        dirlist , fromdir("`fromdir'/`d'") `list'
    }

end

log using filenames.log, replace
local cdir = "`c(pwd)'"
dirlist, fromdir("`cdir'")
log close

insheet using filenames.log
// Clean log file of any rows not associated with a filename and path
keep if regexm(v1, "^T") == 1
rename v1 filename
outsheet using "T:\Final\final_txt.txt", nonames replace

clear all
file open myfile using "T:\Final\final_txt.txt", read

// read the first file to start the master dataset
file read myfile line
insheet using `line', comma names
di as text `line'
save master_data, replace
clear

// read each remaining file and add it to the master dataset
file read myfile line
while r(eof)==0 {
    insheet using `line'
    di as text `line'
    save temp, replace
    append using master_data, force
    save master_data, replace
    **save temp, replace
    clear
    file read myfile line
}
file close myfile

append using master_data
outsheet using "T:\Final\combined_data.csv", comma names replace
--END CODE--

Best wishes

Tim

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Robert Picard
Sent: 30 October 2013 15:18
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: recursively search folder sub directories and store filenames in a text file

If this is a one shot deal, I would have simply copied the output from the results window to a text file and
processed the list from there. Using a log file to capture the list is also simple. It does make sense, however, that a program that recursively lists files should save the list to a dataset, so here's a modified version that adds that capability. While I was at it, I added a -pattern()- option if you want to restrict the search.

Robert

* ----- begin example ---------------------------
cap program drop dirlist
program define dirlist

    syntax , fromdir(string) save(string) ///
        [pattern(string) replace append]

    // get files in "`fromdir'" using pattern
    if "`pattern'" == "" local pattern "*"
    local flist: dir "`fromdir'" files "`pattern'"

    qui {

        // initialize dataset to use
        if "`append'" != "" use "`save'", clear
        else {
            clear
            gen fname = ""
        }

        // add files to the dataset
        local i = _N
        foreach f of local flist {
            set obs `++i'
            replace fname = "`fromdir'/`f'" in `i'
        }

        save "`save'", `replace'
    }

    // recursively list directories in "`fromdir'"
    local dlist: dir "`fromdir'" dirs "*"
    foreach d of local dlist {
        dirlist , fromdir("`fromdir'/`d'") save(`save') ///
            pattern("`pattern'") append replace
    }

end

* start from the current directory
local cdir = "`c(pwd)'"

* list all files
dirlist, fromdir("`cdir'") save("allfiles.dta") replace

* list all Excel files
dirlist, fromdir("`cdir'") save("dofiles.dta") ///
    pattern("*.xls") replace
* ----- end example -----------------------------

On Wed, Oct 30, 2013 at 6:16 AM, Tim Evans <Tim.Evans@phe.gov.uk> wrote:
> Robert,
>
> Thank you very much, this does indeed seem to do the trick - I am impressed! What I would like to do is save the list of files into either a .dta file or a text file which I can then read into Stata. The aim then will be to run through each record and open the file.
>
> The only suggestion I have at the moment would be to open a log file and save this, although this might not be the best way of doing things. Do you have any advice?
>
> Best wishes
>
> Tim
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Robert Picard
> Sent: 29 October 2013 13:45
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: recursively search folder sub directories and store filenames in a text file
>
> Here is a way, from an initial directory, to recursively list all files in Stata.
>
> Robert
>
> * ----- begin example ---------------------------
> cap program drop dirlist
> program define dirlist
>
>     syntax , fromdir(string)
>
>     // list of all files in "`fromdir'"
>     local flist: dir "`fromdir'" files "*"
>     foreach f of local flist {
>         dis "`fromdir'/`f'"
>     }
>
>     // recursively list directories in "`fromdir'"
>     local dlist: dir "`fromdir'" dirs "*"
>     foreach d of local dlist {
>         dirlist , fromdir("`fromdir'/`d'") `list'
>     }
>
> end
>
> local cdir = "`c(pwd)'"
> dirlist, fromdir("`cdir'")
>
> * ----- end example -----------------------------
>
> On Tue, Oct 29, 2013 at 8:04 AM, Tim Evans <Tim.Evans@phe.gov.uk> wrote:
>> Hi all,
>>
>> I am using Stata 11.2 and have a working directory called "T:\Projects\Final". In this folder I have a number of subfolders, e.g. GEH_2013 and SWB_2013, and within these I have, for example, GEH_COL and GEH_OGD. Within each of these folders I have a csv file.
>>
>> So folder structure looks like:
>>
>> T:\Projects\Final
>> T:\Projects\Final\GEH_2013
>> T:\Projects\Final\GEH_2013\GEH_COL
>> T:\Projects\Final\GEH_2013\GEH_COL\GEH_COL_combined.csv
>> T:\Projects\Final\GEH_2013\GEH_OGD
>> T:\Projects\Final\GEH_2013\GEH_OGD\GEH_OGD_combined.csv
>> T:\Projects\Final\SWB_2013
>> T:\Projects\Final\SWB_2013\SWB_COL
>> T:\Projects\Final\SWB_2013\SWB_COL\SWB_COL_combined.csv
>> T:\Projects\Final\SWB_2013\SWB_OGD
>> T:\Projects\Final\SWB_2013\SWB_OGD\SWB_OGD_combined.csv
>>
>>
>> What I am trying to do is ultimately identify the names of each csv file contained at the third level of sub-directory and append the csv files into one large file.
>>
>> I have taken a look at using the following:
>>
>> rcd, :! dir *.csv /a-d /b >filelist.txt
>>
>> but all this does is create a text file in each sub-directory with the name of the csv file in that directory - so for T:\Projects\Final I have an empty text file as no csv files here, but what I need is a single text file that contains the filename and path for each csv file contained within T:\Projects\Final.
>>
>> Once I have this, my aim is to use the filenames and paths stored in the text file and to combine each csv file into one file.
>>
>> If anyone has a more elegant method of appending all csv files that are stored within sub-directories of a folder then I'd be grateful to hear!
>>
>> Best wishes
>>
>> Tim
>>
>> **************************************************************************
>> The information contained in the EMail and any attachments is
>> confidential and intended solely and for the attention and use of the
>> named addressee(s). It may not be disclosed to any other person
>> without the express authority of Public Health England, or the
>> intended recipient, or both. If you are not the intended recipient,
>> you must not disclose, copy, distribute or retain this message or any
>> part of it.
>> This footnote also confirms that this EMail has been
>> swept for computer viruses by Symantec.Cloud, but please re-sweep any
>> attachments before opening or saving. http://www.gov.uk/PHE
>> **************************************************************************
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>> * http://www.ats.ucla.edu/stat/stata/
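
A note on the dataset route mentioned at the top of the thread: the difficulty of needing the filename dataset in memory while also needing an empty dataset for -insheet- can probably be sidestepped by copying the filenames into local macros first. What follows is only a minimal sketch, not code posted by either author. It assumes that allfiles.dta was created by the second version of -dirlist- with pattern("*.csv"), so that it holds one path per observation in fname, and that at least one file was found. The variable name source and the output file combined_data.csv in the working directory are illustrative choices, not part of the original code.

```stata
* Sketch only. Assumption: allfiles.dta comes from the -dirlist- program
* with save("allfiles.dta") pattern("*.csv"), one path per row in fname.
use "allfiles.dta", clear

* copy each filename into its own local macro so the dataset can be cleared
local n = _N
forvalues i = 1/`n' {
    local file`i' = fname[`i']
}

* read the csv files one at a time and stack them in a temporary file
tempfile master
forvalues i = 1/`n' {
    insheet using "`file`i''", comma names clear
    * source is a hypothetical variable recording where each row came from
    gen source = "`file`i''"
    * add -force- if variable types differ across files, as in Tim's code
    if `i' > 1 append using "`master'"
    save "`master'", replace
}

* the data in memory now hold all files; the output name is illustrative
outsheet using "combined_data.csv", comma names replace
```

Because the filenames live in macros rather than in the data in memory, each -insheet- call can use its -clear- option freely. Using a -tempfile- for the running total also avoids leaving temp.dta and master_data.dta behind in the project folder.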