
Re: st: making Stata read do-files

From	"Sergiy Radyakin" <[email protected]>
To	[email protected]
Subject	Re: st: making Stata read do-files
Date	Wed, 16 Apr 2008 16:02:27 -0400

Hello Gabi,

1) I have had some experience pre-processing ado-files before letting
Stata execute them, and I suggest paying attention to syntax errors in
them. Lines containing unmatched single quotes can cause Stata to stop
with an error (even if those lines are never executed). Checking for
them in advance can help, and since I process files line by line I can
even report the line number where the error occurs (something Stata
doesn't tell you; why?).

2) If you plan to store program lines as observations, each line must
be short enough to fit within the 244-character limit on string
variables.

3) It seems quite clear how you add lines to your project, but how do
you remove them? Why not store the daily snapshots in archives, or as
incremental backups?

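4) You write "drop duplicates in terms of cmd". Are you sure?

   drop in 1
   drop in 1

is not equivalent to:

   drop in 1

Each -drop in 1- removes whatever observation is currently first, so
the first sequence removes two observations and the second only one.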

5) It is very interesting what you are doing, but doing it in Stata
can be unnecessarily difficult.
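
As a rough sketch of the checks in points 1) and 2), assuming Mata is
available (Stata 9 or later): the function below reads a file line by
line and returns the line numbers where the counts of left and right
single quotes differ, or where the line is longer than a given limit.
The name scanlines() and its arguments are made up for illustration.
A plain apostrophe in a comment or string will also be flagged, so the
result is a list of lines to inspect, not a list of confirmed errors.

```stata
* Hypothetical helper: flag lines with unbalanced macro quotes or
* excessive length. char(96) is the left quote and char(39) the right
* quote; using char() keeps literal macro quotes out of this file.
mata:
real matrix scanlines(string scalar fname, real scalar maxlen)
{
    real scalar   fh, n, nleft, nright
    real matrix   bad
    string scalar line

    bad = J(0, 3, .)            // columns: line no., quote flag, length flag
    fh  = fopen(fname, "r")
    n   = 0
    // fget() returns J(0,0,"") at end of file
    while ((line = fget(fh)) != J(0, 0, "")) {
        n++
        // count a character as the length lost when it is removed
        nleft  = strlen(line) - strlen(subinstr(line, char(96), ""))
        nright = strlen(line) - strlen(subinstr(line, char(39), ""))
        if (nleft != nright | strlen(line) > maxlen) {
            bad = bad \ (n, nleft != nright, strlen(line) > maxlen)
        }
    }
    fclose(fh)
    return(bad)
}
end
```

It could then be called as follows, where myfile.do is a placeholder
name; the returned matrix has one row per suspicious line:

   mata: scanlines("myfile.do", 244)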

Best regards, Sergiy

On 4/16/08, Gabi Huiber <[email protected]> wrote:
> Thank you, everybody, for your suggestions.
>
> Alan's solution worked perfectly. He was right to assume that I would
> not want to rename files. And he was right to guess that I was running
> Windows. Next time I'll know to mention my OS.
>
> The larger problem I was trying to solve was this: go through a mess
> of directory paths and eventually find a load of do-files in each of
> them, saved weekly -- sometimes with names such as fileYYYYMMDD.do,
> and other times with names such as fileapr1608.do. Then read each of
> those do-files line by line, but don't interpret them. Instead, write
> each line to a .dta file as an observation in a variable called cmd
> (as in command). Next to it, write the date of the file that that line
> came from, in the format YYYYMMDD (because it reads well to the human
> eye and sorts chronologically), as the corresponding observation in a
> variable called date. Then drop duplicates in terms of cmd.
>
> The goal is twofold: I want to easily track changes made to the
> do-files over time, and I want to use these dta files to make Stata
> write its own do-files on the fly. If, for example, I want to
> reconstitute the weekly do-file saved on 20071231, I just keep all the
> observations in the dta file where date<=20071231.  As time goes by
> and people keep saving these weekly do-files, I just send Stata to
> scrape the directories anew and re-assemble the master dta file.
>
> I did not want to mess with the do-file names because other people
> still use them and I wanted to do my work with as little disruption to
> them as possible.
>
> Of course had my client used some kind of proper revision control
> system, like RCS in Unix, this effort would have been unnecessary. How
> do the Statalisters deal with revision control? Is there a
> Stata-specific good practices write-up on the matter? Might somebody
> present one at the Chicago meeting?
>
> Regards,
>
> Gabi
>
>
> On 4/15/08, Alan Riley <[email protected]> wrote:
> > Gabi Huiber is using the extended macro function -dir- under Windows
> >  and is having a problem with uppercase letters in filenames:
> >
> >
> >  > I am having a problem with uppercase letters in file names.
> >  >
> >  > I am trying to impose some version control on a bunch of do-files of
> >  > the type [filename]_yyyymmdd.do, that have been saved weekly with
> >  > various changes over the past two years or so.
> >  >
> >  > In the first stage of this job I just want to read their names, then
> >  > collect the yyyymmdd part in a matrix. But I am hitting a snag with
> >  > files where the alphanumeric part contains uppercase characters. My
> >  > code goes like this:
> >  >
> >  > [snip]
> >  >
> >  > local `k'list: dir "${dofrom}" files "`k'*.do"
> >  > local `k'clean: list clean `k'list
> >  > local `k'num: list sizeof `k'clean
> >  > di "`k'"
> >  > di ``k'num'
> >  >
> >  > [snip]
> >
> >
> >  Gabi has some files that begin with the prefix "DNAreport_", but when
> >  the local macro `k' above contains that prefix, nothing is put into
> >  the local macro ``k'list' by -dir-.  That results in the final two
> >  lines of the code displaying
> >
> >  > DNAreport_
> >  > 0
> >
> >  and a subsequent part of Gabi's code that does not expect to see
> >  0 files fails with an error.
> >
> >
> >  Let me first explain what Gabi can do in the code above to fix it
> >  for now.  After that I'll explain what is happening and what we
> >  (StataCorp) should do in the future about this.
> >
> >  Gabi can replace the first line of the code above,
> >
> >
> >    local `k'list: dir "${dofrom}" files "`k'*.do"
> >
> >
> > with the following two lines:
> >
> >    local lowerk = strlower("`k'")
> >    local `k'list: dir "${dofrom}" files "`lowerk'*.do"
> >
> >  The -dir- extended macro function, under Windows, sees all filenames
> >  as lowercase.  So, Gabi can lowercase the pattern which -dir- is using
> >  to match filenames, and then -dir- will return the list of files Gabi
> >  needs.
> >
> >  Read on only if you are interested in some technical details.
> >
> >  Why does -dir- work this way under Windows?
> >
> >  Windows does not have true case-sensitive filenames.  Macintosh OS X
> >  and Unix both have true case-sensitive filenames, so there is no issue
> >  with those operating systems.
> >
> >  In Windows, a file named
> >
> >    DNAreport_1.do
> >
> >  is no different than a file named
> >
> >    dnareport_1.do
> >
> >  or
> >
> >    DNAREPORT_1.DO
> >
> >
> >  Gabi can see this in a Windows command window (not to be confused
> >  with Stata's Command window) by changing to the directory where the
> >  DNAreport_*.do files exist and typing
> >
> >    C:\somedirectory> dir DNA*
> >
> >    C:\somedirectory> dir dna*
> >
> >  In both cases, the same files will be shown.
> >
> >
> >  Stata knows that Windows does not care about the case of filenames.
> >  There are routines in Stata that need to make a list of all filenames
> >  of a certain type (think -adoupdate-), and under Windows, those
> >  routines need to be assured that they are finding, say, *.ado files,
> >  no matter whether Windows has stored some of them with capital letters
> >  in their names or not.
> >
> >
> >  However, there are cases where users such as Gabi do care about
> >  the case of filenames under Windows, and Stata needs to support that.
> >  In a future update we will add a Windows-only option to the -dir-
> >  extended macro function (perhaps 'respectcase').  Without the option,
> >  -dir- will behave as it does now.  With the option, -dir- will pay
> >  attention to case when matching files, and the list of files returned
> >  by -dir- will preserve the case of the filenames in Windows.
> >
> >
> >
> >  --Alan Riley
> >  ([email protected])
> >

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
