
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: RE: recursively search folder sub directories and store filenames in a text file


From   Tim Evans <Tim.Evans@phe.gov.uk>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   RE: st: RE: recursively search folder sub directories and store filenames in a text file
Date   Thu, 31 Oct 2013 10:00:06 +0000

What I meant to finish off below was that I couldn't resolve a conflict: I needed to load the data file created in the first part of the routine to access the filenames of the files I wanted to combine into one dataset, while at the same time needing an empty dataset in order to use -insheet- to read in each file of interest and append it to one large dataset. I couldn't work this out, so I stopped.
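
One possible way around the conflict described above, offered as a sketch rather than a tested solution: copy the filenames out of the list dataset into local macros first, so that the dataset in memory can then be cleared and reused for -insheet-. The sketch assumes the list was saved as allfiles.dta with the paths in a string variable fname (as in Robert's second -dirlist- example quoted further down). The tempfile name combined is made up for illustration, and each path is assumed to be short enough to survive a string expression (244 characters in Stata 11, as far as I recall).

```stata
* Sketch only: allfiles.dta and fname are assumed to come from
* -dirlist, fromdir() save("allfiles.dta")-; combined is a hypothetical name.
use "allfiles.dta", clear

* Step 1: copy every filename into its own local macro (file1, file2, ...)
local nfiles = _N
forvalues i = 1/`nfiles' {
    local file`i' = fname[`i']
}

* Step 2: the list is now held in macros, so memory can be reused freely
tempfile combined
local first = 1
forvalues i = 1/`nfiles' {
    insheet using "`file`i''", comma names clear
    * from the second file onwards, add everything combined so far
    if !`first' append using "`combined'"
    save "`combined'", replace
    local first = 0
}

* the combined data are now in memory, ready for -save- or -outsheet-
```

Because the filenames live in macros rather than in the dataset, there is no need to switch back and forth between the list and the data being built. If a variable is string in one file and numeric in another, -append- will stop with an error unless -force- is added, as in the code below.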

 
-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Tim Evans
Sent: 31 October 2013 09:46
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: RE: recursively search folder sub directories and store filenames in a text file

Robert,

Thanks for all of your help. I eventually went down the route of saving the results in a log file and then reading in the files; the code I used is below. I did try to take advantage of the data file you helped create in your second suggestion, but I couldn't overcome the fact that I had to load that file to access the values (filenames) while at the same time needing an empty dataset.

--BEGIN CODE--

clear all

cd "T:\Final"

cap program drop dirlist
program define dirlist

   syntax, fromdir(string)

   // list of all files in "`fromdir'"
   local flist: dir "`fromdir'" files "*.csv"
   foreach f of local flist {
      dis "`fromdir'/`f'"
   }

   // recursively list directories in "`fromdir'"
   local dlist: dir "`fromdir'" dirs "*"
   foreach d of local dlist {
      dirlist , fromdir("`fromdir'/`d'")
   }

end

log using filenames.log, replace

local cdir = "`c(pwd)'"
dirlist, fromdir("`cdir'")

log close

insheet using filenames.log
* Clean log file of any rows not associated with a filename and path
keep if regexm(v1, "^T") == 1
rename v1 filename

outsheet using "T:\Final\final_txt.txt", nonames replace

clear all

file open myfile using "T:\Final\final_txt.txt", read
file read myfile line

insheet using `line', comma names
di as text `line'
save master_data, replace
clear
file read myfile line
while r(eof)==0 {
	insheet using `line'
	di as text `line'
	save temp, replace
	append using master_data, force
	save master_data, replace
	**save temp, replace
	clear
		file read myfile line
}
file close myfile
append using master_data

outsheet using "T:\Final\combined_data.csv", comma names replace

--END CODE--
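
An alternative to routing the listing through a log file, sketched here under assumptions: let the recursive program write each path straight to a text file with -file write-. The program name dirlist_write, the handle() option, and the output name filelist.txt are invented for this illustration. It relies on a file handle opened under an ordinary (non-temporary) name being visible inside programs, which is my understanding of how -file- behaves.

```stata
* Sketch only: dirlist_write, handle() and filelist.txt are hypothetical names.
cap program drop dirlist_write
program define dirlist_write

   syntax , fromdir(string) handle(name)

   // write the path of every csv file in "`fromdir'" to the open text file
   local flist: dir "`fromdir'" files "*.csv"
   foreach f of local flist {
      file write `handle' `"`fromdir'/`f'"' _n
   }

   // recurse into each subdirectory of "`fromdir'"
   local dlist: dir "`fromdir'" dirs "*"
   foreach d of local dlist {
      dirlist_write , fromdir("`fromdir'/`d'") handle(`handle')
   }

end

* open the output file once, walk the tree, then close the file
file open fh using "filelist.txt", write text replace
dirlist_write , fromdir("`c(pwd)'") handle(fh)
file close fh
```

The resulting file holds one unquoted path per line, so nothing has to be cleaned out of a log afterwards. When reading it back with -file read-, the macro would need quotes around it, e.g. insheet using "`line'", because the paths are not quoted in the file.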

Best wishes

Tim

-----Original Message-----
From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Robert Picard
Sent: 30 October 2013 15:18
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: RE: recursively search folder sub directories and store filenames in a text file

If this is a one-shot deal, I would have simply copied the output from the Results window to a text file and processed the list from there.
Using a log file to capture the list is also simple. It does make sense, however, for a program that recursively lists files to save the list to a dataset, so here's a modified version that adds that capability. While I was at it, I added a -pattern()- option in case you want to restrict the search.

Robert

* ----- begin example ---------------------------
cap program drop dirlist
program define dirlist

  syntax , fromdir(string) save(string) ///
    [pattern(string) replace append]

  // get files in "`fromdir'" using pattern
  if "`pattern'" == "" local pattern "*"
  local flist: dir "`fromdir'" files "`pattern'"

  qui {

    // initialize dataset to use
    if "`append'" != "" use "`save'", clear
    else {
      clear
      gen fname = ""
    }

    // add files to the dataset
    local i = _N
    foreach f of local flist {
      set obs `++i'
      replace fname = "`fromdir'/`f'" in `i'
    }
    save "`save'", `replace'

  }

  // recursively list directories in "`fromdir'"
  local dlist: dir "`fromdir'" dirs "*"
  foreach d of local dlist {
    dirlist , fromdir("`fromdir'/`d'") save("`save'") ///
    pattern("`pattern'") append replace
  }

end

* start from the current directory
local cdir = "`c(pwd)'"

* list all files
dirlist, fromdir("`cdir'") save("allfiles.dta") replace

* list all Excel files
dirlist, fromdir("`cdir'") save("xlsfiles.dta") ///
  pattern("*.xls") replace

* ----- end example -----------------------------

On Wed, Oct 30, 2013 at 6:16 AM, Tim Evans <Tim.Evans@phe.gov.uk> wrote:
> Robert,
>
> Thank you very much, this does indeed seem to do the trick - I am impressed! What I would like to do is save the files I list into either a .dta file, or to a text file which I can then read into Stata. The aim then will be to run through each record and open the file.
>
> The only idea I have at the moment is to open a log file and save the output to it, although this might not be the best way of doing things. Do you have any advice?
>
> Best wishes
>
> Tim
>
> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Robert Picard
> Sent: 29 October 2013 13:45
> To: statalist@hsphsun2.harvard.edu
> Subject: Re: st: RE: recursively search folder sub directories and store filenames in a text file
>
> Here is a way, from an initial directory, to recursively list all files in Stata.
>
> Robert
>
> * ----- begin example ---------------------------
> cap program drop dirlist
> program define dirlist
>
>    syntax , fromdir(string)
>
>    // list of all files in "`fromdir'"
>    local flist: dir "`fromdir'" files "*"
>    foreach f of local flist {
>       dis "`fromdir'/`f'"
>    }
>
>    // recursively list directories in "`fromdir'"
>    local dlist: dir "`fromdir'" dirs "*"
>    foreach d of local dlist {
>       dirlist , fromdir("`fromdir'/`d'")
>    }
>
> end
>
> local cdir = "`c(pwd)'"
> dirlist, fromdir("`cdir'")
>
> * ----- end example -----------------------------
>
> On Tue, Oct 29, 2013 at 8:04 AM, Tim Evans <Tim.Evans@phe.gov.uk> wrote:
>> Hi all,
>>
>> I am using Stata 11.2 and have a working directory called "T:\Projects\Final". In this folder I have a number of subfolders, e.g. GEH_2013 and SWB_2013, and within these I have, for example, GEH_COL and GEH_OGD. Each of these folders contains a csv file.
>>
>> So the folder structure looks like:
>>
>> T:\Projects\Final
>> T:\Projects\Final\GEH_2013
>> T:\Projects\Final\GEH_2013\GEH_COL
>> T:\Projects\Final\GEH_2013\GEH_COL\GEH_COL_combined.csv
>> T:\Projects\Final\GEH_2013\GEH_OGD
>> T:\Projects\Final\GEH_2013\GEH_OGD\GEH_OGD_combined.csv
>> T:\Projects\Final\SWB_2013
>> T:\Projects\Final\SWB_2013\SWB_COL
>> T:\Projects\Final\SWB_2013\SWB_COL\SWB_COL_combined.csv
>> T:\Projects\Final\SWB_2013\SWB_OGD
>> T:\Projects\Final\SWB_2013\SWB_OGD\SWB_OGD_combined.csv
>>
>>
>> What I am trying to do is ultimately identify the names of each csv file contained at the third level of sub-directory and append the csv files into one large file.
>>
>> I have taken a look at using the following:
>>
>> rcd, :! dir *.csv /a-d /b >filelist.txt
>>
>> but all this does is create a text file in each sub-directory with the name of the csv file in that directory - so for T:\Projects\Final I have an empty text file as no csv files here, but what I need is a single text file that contains the filename and path for each csv file contained within T:\Projects\Final.
>>
>> Once I have this, my aim is to use the filenames and paths stored in the text file and to combine each csv file into one file.
>>
>> If anyone has a more elegant method of appending all csv files that are stored within sub-directories of a folder then I'd be grateful to hear!
>>
>> Best wishes
>>
>> Tim
>>
>> *********************************************************************
>> *
>> **** The information contained in the EMail and any attachments is 
>> confidential and intended solely and for the attention and use of the 
>> named addressee(s). It may not be disclosed to any other person 
>> without the express authority of Public Health England, or the 
>> intended recipient, or both. If you are not the intended recipient, 
>> you must not disclose, copy, distribute or retain this message or any 
>> part of it. This footnote also confirms that this EMail has been 
>> swept for computer viruses by Symantec.Cloud, but please re-sweep any 
>> attachments before opening or saving. http://www.gov.uk/PHE
>> *********************************************************************
>> *
>> ****
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/