The Stata listserver

st: RE: Access datasets created by date

Date   Mon, 4 Apr 2005 14:53:14 -0400


    I asked the following question on Friday and received great help
that solved my problem. Michael Blasnik, Austin Nichols, and Jens
Lauritsen each wrote a program for me, in different flavors. I summarize
the solutions below in case anyone needs help with a similar problem.
I'm grateful to all of you.

My Question:
I have a folder of text files, with new files becoming available daily.
I would like to access these daily files by date in my program. Is it
possible in Stata to identify the datasets by the date they were
created? My datasets are named like the following:

Name				Date Modified
PARCEL.G1911V00		03/29/05
PARCEL.G1921V00		03/29/05
PARCEL.G1914V00		03/30/05
and so on.

In my program, I would like to capture only PARCEL.G1911V00 and
PARCEL.G1921V00, since they were created on the same date (03/29/05),
and ignore PARCEL.G1914V00, which I would like to capture on the
03/30/05 run.

Nick Cox:
I don't know an easy way to do this
from within Stata.

The details depend on which OS you are using, but you could route the
results of a -dir- or -ls- to a file and process that.

It may well be easier for you to write a script in your favourite
scripting language (Perl, Python, Awk, whatever) to ensure that
files are renamed so that the names show dates more transparently,
and then to read into Stata the files satisfying a given pattern.
Naturally, your set-up may prohibit that.

Alternatively, you could write something to
work out the difference between the files
there now and those there the last time you looked.
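
The rename-then-match idea can be sketched in Stata itself. This is only an illustration under stated assumptions: the naming scheme PARCEL_YYYYMMDD_*.txt is hypothetical (the files in the question are not named this way), and the `: dir` extended macro function and the %tdCCYYNNDD display format are assumed to be available in the Stata release at hand; neither is used anywhere else in this thread.

```stata
* Sketch: list files whose NAME carries today's date.
* Assumed (hypothetical) naming scheme: PARCEL_YYYYMMDD_<id>.txt
local today : display %tdCCYYNNDD date(c(current_date), "DMY")

* -: dir- returns the matching file names in the current directory
local files : dir "." files "PARCEL_`today'_*.txt"

foreach f of local files {
    display as text "would read `f'"
    * insheet using "`f'", clear   // reader command depends on the file layout
}
```

If no file matches, the list is empty and the loop does nothing.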

Austin Nichols's program:
prog def gettoday, rclass
 cap drop _all
 * write a fresh directory listing to a text file (Windows shell)
 !del tempdir.txt
 !dir > tempdir.txt
 infix str date 1-10 str time 11-19 str size 20-38 str fname 39-80 ///
  using tempdir.txt
 * keep only entries dated today
 gen fdate=date(date,"mdy")
 keep if fdate==date(c(current_date), "dmy")
 keep if fname!="."
 keep if fname!=".."
 * collect the file names, skipping the listing file itself
 forval i=1/`=_N' {
  if "`: di fname[`i']'"!="tempdir.txt" {
   local names="`names' `: di fname[`i']'"
  }
 }
 if "`names'"=="" {
  return local names "."
 }
 else {
  return local names "`names'"
 }
end

then type 
. qui gettoday
. ret li
and the list of today's file names will be visible in r(names).
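
As a rough sketch of what might come next, the returned list can be looped over. This assumes -gettoday- is defined as above and that file names contain no spaces; the -insheet- line is a placeholder, since the thread does not say how the text files should be read.

```stata
* Sketch: process each file returned by gettoday (defined above).
quietly gettoday
local todays `r(names)'

if "`todays'" == "." {
    * gettoday returns "." when nothing is dated today
    display as text "no files dated today"
}
else {
    foreach f of local todays {
        display as text "processing `f'"
        * insheet using "`f'", clear   // hypothetical reader; adjust to the file layout
    }
}
```

Note that -gettoday- drops whatever data are in memory, so it should be called before any data that need to be kept are loaded.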

Michael Blasnik's program:
program define dayfiles, rclass
version 8.2
syntax [, Filespec(str) IFDate(str)]
drop _all
if "`filespec'"=="" local filespec "*.*"
* write the directory listing to a text file and read it back
!dir `filespec' > mydir.txt
infix str x 1-80 using mydir.txt
split x
* lines that do not start with a date (headers, totals) are dropped
gen int date=date(x1,"mdy",2040)
drop if date==.
rename x5 filename
keep date filename
if "`ifdate'"!="" keep if date==`ifdate'
* build a list of quoted file names
forval i =1/`=_N-1' {
 local f `"`f'`"`=filename[`i']'"' "'
}
if _N>0 local f `"`f'`"`=filename[_N]'"'"'
return local files `"`f'"'
end
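
A possible call, shown only as a sketch: it assumes -dayfiles- is defined as above, that the Windows -dir- listing has the date in the first column and the file name in the fifth (as the program expects), and that the files of interest match PARCEL.*. Because ifdate() is substituted into -keep if date==...-, any expression that evaluates to a Stata date should work there; mdy() is used here.

```stata
* Sketch: ask dayfiles (defined above) for files dated 29 March 2005.
display %td mdy(3,29,2005)

dayfiles, filespec(PARCEL.*) ifdate(mdy(3,29,2005))

* r(files) holds the names as a list of quoted strings
local todo `"`r(files)'"'
foreach f of local todo {
    display as text `"found `f'"'
}
```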

Morten Andersen's program:
-------------------------------- BEGIN dirlist.ado

*! 1.3 MA 2004-10-06 23:56:48
* saves directory data in r() macros fnames, fdates, ftimes, fsizes,
* used by dodoc.ado

program define dirlist, rclass

    version 8

    syntax anything
    tempfile dirlist

    if "`c(os)'" == "Windows" {
        local shellcmd `"dir `anything' > `dirlist'"'
    }

    if "`c(os)'" == "MacOSX" {
        local anything = subinstr(`"`anything'"', `"""', "", .)
        local shellcmd `"ls -lT `anything' > `dirlist'"'
    }

    if "`c(os)'" == "Unix" {
        local anything = subinstr(`"`anything'"', `"""', "", .)
        local shellcmd `"ls -l --time-style='+%Y-%m-%d %H:%M:%S'"'
        local shellcmd `"`shellcmd' `anything' > `dirlist'"'
    }

    quietly shell `shellcmd'

    * read directory data from temporary file
    tempname fh
    file open `fh' using "`dirlist'", text read
    file read `fh' line
    local nfiles = 0
    local curdate = date("`c(current_date)'","dmy")
    local curyear = substr("`c(current_date)'",-4,4)
    while r(eof)==0  {
        if `"`line'"' ~= "" & substr(`"`line'"',1,1) ~= " " {

            * read name and data for each file

            if "`c(os)'" == "MacOSX" {
                local fsize : word 5 of `line'
                local fda   : word 6 of `line'
                local fmo   : word 7 of `line'
                local ftime : word 8 of `line'
                local fyr   : word 9 of `line'
                local fname : word 10 of `line'
                local fdate =  ///
                    string(date("`fmo' `fda' `fyr'","mdy"),"%dCY-N-D")
            }

            if "`c(os)'" == "Unix" {
                local fsize : word 5 of `line'
                local fdate : word 6 of `line'
                local ftime : word 7 of `line'
                local fname : word 8 of `line'
            }

            if "`c(os)'" == "Windows" {
                local fdate : word 1 of `line'
                local ftime : word 2 of `line'
                local word3 : word 3 of `line'
                if upper("`word3'")=="AM" | upper("`word3'")=="PM" {
                    local ftime "`ftime'-`word3'"
                    local fsize : word 4 of `line'
                    local fname : word 5 of `line'
                }
                else {
                    local fsize : word 3 of `line'
                    local fname : word 4 of `line'
                }
            }

            local fnames "`fnames' `fname'"
            local fdates "`fdates' `fdate'"
            local ftimes "`ftimes' `ftime'"
            local fsizes "`fsizes' `fsize'"
            local nfiles = `nfiles' + 1
        }

        file read `fh' line
    }
    file close `fh'
    return local fnames `fnames'
    return local fdates `fdates'
    return local ftimes `ftimes'
    return local fsizes `fsizes'
    return local nfiles `nfiles'

end

* end

-------------------------------- END dirlist.ado
-------------------------------- BEGIN dirlist.hlp

{* 2004-03-12 15:57:14}{...}
help for {hi:dirlist} {right: (version 1.3, 2004-10-06)}

{title:Retrieve directory information}

{p 4 13 2}{cmd:dirlist} [{it:filespec}]


{p 4 4 2}
{cmd:dirlist} is used like the {cmd:dir} command, but retrieves the
information about files in return macros (see below).

{p 4 4 2}
{it:filespec} may be any valid Windows, Unix, or Macintosh file path or
specification (see {hi:[U] 14.6 File-naming conventions}) and may contain
"{cmd:*}" to indicate any string of characters.

{p 4 4 2}
Directory data are written to a temporary file using shell commands
(Windows {cmd:dir} and Mac OS X or Unix {cmd:ls}) and subsequently read by
the program.

{p 4 4 2}
Mac OS X: Spaces in the {it:filespec} should be preceded by an escape
character "{cmd:\}".


{p 4 8 2}
{cmd:. dirlist dm50*.do}

{p 4 4 2}
You can then access the returned results:

{p 4 4 2}
{cmd:. return list}

            r(nfiles) : "4"
            r(fsizes) : "814 209 296 493"
            r(ftimes) : "13:27:15 13:29:05 12:22:01 13:41:09"
            r(fdates) : "2003-10-30 2003-10-30 2003-10-30 2003-10-30"
            r(fnames) : ""
{p 4 8 2}
{cmd:. dirlist ~/DM\ data/dm50*.do} {it:(Mac OS X, space in directory name)}


{p 4 4 2}
The ado-file has been tested on Mac OS X, Windows XP and one type of
Unix. Problems could be caused by the layout of directory lists
regarding column arrangement and date format.


{p 4 4 2}
Morten Andersen, Research Unit for General Practice{break}
University of Southern Denmark, Denmark{break}

{title:Also see}

{p 4 13 2}
Online:  help for
{help dir},
{help shell},
{help return}

-------------------------------- END dirlist.hlp
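
Since -dirlist- returns names and dates as parallel, space-separated lists, picking out the files for one date comes down to walking the two lists together. The sketch below is self-contained: it uses the file names and dates from the question as stand-ins for r(fnames) and r(fdates). The date format in r(fdates) depends on the operating system's directory listing, so the target string here is an assumption to be adjusted, and the approach assumes file names without spaces.

```stata
* Sketch: select the files whose listed date matches a target date.
* In real use, after -dirlist PARCEL.*- one would instead set
*   local fnames `r(fnames)'
*   local fdates `r(fdates)'
local fnames "PARCEL.G1911V00 PARCEL.G1921V00 PARCEL.G1914V00"
local fdates "03/29/05 03/29/05 03/30/05"
local target "03/29/05"   // must match the format found in r(fdates)

local picked
local n : word count `fnames'
forvalues i = 1/`n' {
    local d : word `i' of `fdates'
    if "`d'" == "`target'" {
        local f : word `i' of `fnames'
        local picked `picked' `f'
    }
}
display as text "`picked'"
```

With the stand-in values this displays PARCEL.G1911V00 PARCEL.G1921V00, the two files the question wants for the 03/29/05 run.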
