
RE: st: RE: do-files as programs

From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   RE: st: RE: do-files as programs
Date   Fri, 26 Sep 2008 14:27:27 +0200

Actually, this is the first time I have come across a recommendation for
-program- instead of nested do-filing. I used to think that -program-s are
the more versatile tools, the ones you would turn to when conducting analyses
for several projects, not just one.
The naming issue simply requires you to -which- the command beforehand.
IMHO, that is not a major obstacle to the strategy described by Gabi.
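
A minimal sketch of such a check, assuming the candidate program name is
-ml- (chosen only because it is the name used in the example quoted below).
It relies on -which- returning a nonzero return code when no built-in
command or ado-file of that name is found:

```stata
* Sketch: test a candidate program name before defining it.
* "ml" is only an illustrative name taken from this thread.
capture which ml
if _rc == 0 {
    * -which- found a built-in command or an ado-file with this name
    display as error "ml is already a Stata command; choose another name"
}
else {
    display as text "no built-in command or ado-file named ml was found"
}
```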


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of philippe van kerm
Sent: Friday, September 26, 2008 2:28 PM
To: [email protected]
Subject: RE: st: RE: do-files as programs

This is interesting.

I have the understanding that 1./2. is the 'preferred' approach, and that
-program-s should be kept for 'creating new commands', not for running
simple batches of commands (that being the purpose of 'do files').

One argument I see against using -program-s instead of multiple/nested
do-files is the risk of inadvertently redefining a command. Try for example,

  pr def ml
    di "Do Maria Lisa's tasks here... some data manipulation for example"
  end

  pr def analysis
    ssc install dagumfit
    sysuse auto
    dagumfit price
  end

  analysis


This fails because -ml- is inadvertently redefining the official -ml-
command which is used internally by -dagumfit-. Not only is -dagumfit- not
working anymore, but -ml- is executed where it should not -- and this can be
a serious issue if -ml- does something unwanted to the data, for example.
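
If this happens, one way to recover within the same session is to drop the
user-defined program from memory. The sketch below assumes the conflicting
program is the -ml- defined above; after dropping it, a later call to -ml-
should be resolved through the ado-path again:

```stata
* Sketch: remove the user-defined -ml- from memory.
* -capture- keeps this from stopping with an error if no such program is loaded.
capture program drop ml

* Optional: list the programs that are still held in memory.
program dir
```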

This means that one would need to be careful and constantly check for name
conflicts whenever strategy 3./4. is adopted. So advising strategy 1./2.
(with do-files) seems in general safer, in particular for novice users who
may not notice the perils.

But I'd be curious to read arguments for/against this claim. (The timing
argument reported here is one!)


> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Gabi Huiber
> Sent: Thursday, September 25, 2008 10:48 PM
> To: [email protected]
> Subject: Re: st: RE: do-files as programs
> Martin, thank you for the pointers. For all they're worth, here are my
> findings:
> There are four basic ways, it seems to me, to organize a project in do-
> files:
> 1. You just make a list of instructions, executed one after another.
> If you need any of them executed more than once, put them inside a
> foreach loop. That's one do-file, and it will get as long and involved
> as the project demands it. It's the way we program when we are new to
> Stata.
> 2. You break up the problem into smaller do-files and have a master
> file call them as needed.
> Neither of the above makes use of the program feature. Do-files are
> read in as many times as they are used. 2. has certain advantages over
> 1. in ease of debugging and general readability, as short do-files are
> easier to pore over than long ones, but the project will have to rely
> on multiple inter-linked do-files instead of one. My guess is that the
> project doesn't have to be terribly complex before the advantages
> trump this drawback.
> 3. You make a do-file organized broadly as follows: in the first
> section you declare any programs you need, then in the second you
> invoke them as needed.
> 4. The programs at 3. above are saved as separate do-files, and a
> master file calls them in once with the "do" command, then
> executes them as many times as needed by invoking their name. So, both
> 3. and 4. do make use of this "program" feature.
> My test project: I had to tabulate one dummy variable in 30 different
> files, then save the matcells in a master matrix. To make it last a
> little, I ran the same thing twice. I organized the project in the
> four ways above:
> 1. One do-file with no programs in it;
> 2. Four separate do-files organized as a master calling the other
> three twice each;
> 3. One do-file with three programs in it, each invoked twice and
> finally
> 4. Four do-files, where the master was calling in the other three
> once, then invoking them each twice.
> The results are as follows: 3 was fastest at 6 seconds, followed by 4
> at 9 seconds or so. 2 and 1 were about equally bad, some 11-12 seconds
> each.
> This suggests that declaring do-files as programs increases
> productivity. I hope this helps somebody.
> Gabi

