
RE: st: To Stata programmers - All programs should provide references

From   "Nick Cox" <>
To   <>
Subject   RE: st: To Stata programmers - All programs should provide references
Date   Fri, 23 Nov 2007 18:08:57 -0000

I agree with Maarten's very sensible comments. 

More can be said, both specifically and generally, in 
response to Tiago's suggestions. 

I just don't agree that _all_ programs should be referenced
or provide explanation of methods. 

1. Some don't need any link to the literature (e.g. some 
self-explanatory data management tools). 

2. Some have backup in publications e.g. in the SJ or STB. 
This was the case for -genhwcci-, which Tiago gave as 
an example, perhaps unfairly in view of the original STB 

3. It might fairly be argued in some cases that if you 
don't know the literature you shouldn't be using the program. 
Or, similarly, many programmers would take it as a fair expectation
that _you_ find the literature. Shucks, they wrote the program
and if it looks trivial feel free to write your own! 

That said, I am sure that a systematic trawl would find 
several cases in which references or methods sections within help files
would help many users a lot, and to that extent I agree
with Tiago. Indeed I have a habit of using a help file
as a zeroth draft of a paper I intend to write some time
and using the help as a store for my personal notes on some technique. 

Tiago in effect raises a very good general question: 
What are the standards to be expected of user-written 
software? In partial reply, 

1. No software can be trusted 100%. A new bug or misfeature
is discovered about daily in official Stata (see -help 
whatsnew- for those documented) and that is tested much, 
much more than most user-written software. 

2. I don't think it is going to help anybody to try to suggest 
formal criteria for an informal process. It is open to anybody 
to publish their own test script or validation results with their
programs, and that is surely welcome, but rightly or wrongly the established 
pattern (especially for SSC) does not include that. Anyone who wants
that as a rule is welcome to start a site on that basis. 

3. Typically I read the message of a new package on SSC as 
"I wrote this and I have banged on it enough to think it worth 
making public. You are welcome to use it if it suits your problem. 
I would appreciate any comments on bugs or other problems." 
This may sound sloppy but there is plenty of evidence that more
people gain by this than lose. Of course, it is open to anyone 
to be paranoid about software they have not triple-checked themselves. 
But a program can jump all the hurdles on a long and detailed test
script and fail the next hurdle in which a new user thinks of 
some input that the programmer didn't imagine. As Maarten emphasised, 
I think software is best tested by being used for real. 

4. Tiago has answered his own question in emphasising that often 
you just have to look at the code; regardless of the documentation, 
I would add. 


Maarten Buis

--- wrote:
> First, most packages (at least those I have used) fail to provide 
> references and detail on how their estimates are obtained (formulae).
> For instance, the program -genhwcci- estimates a standard error for an 
> estimate called disequilibrium coefficient (D), but no reference is 
> given to show to the users where the formulae come from.

It is not uncommon for the help-file to contain a section called
references. I agree with Tiago that adding such a section is good

> Second, before being available for general users, Stata programs (i.e.
> those available in SSC or elsewhere) are extensively validated?
> As a reviewer, should one trust 100% in new user written Stata 
> programs?

That depends on the programmer. I try checking my software against two
examples / datasets, and against a simulation, but I don't always live
up to my good intentions. Even if I do all the tests, errors have still
seeped in. I would not know how to enforce rules on software validation
for user written software. Even if that were possible I don't know
whether that would be a good idea. The current situation means a larger
burden for the user, in the sense that they have to assess whether or not
to trust the results. However I would be worried that any attempt to
regulate user written software would just diminish production of
software. Sometimes, a user written program needs to grow with input
from other users. A more useful approach would be a list of tips and
tricks on software validation. 
