I want to add something to my posting on the treatment of missing values in Mata functions. You may remember that I divided the numeric functions into three categories,

    (M) Mathematical functions
    (S) Statistical functions
    (U) Utility functions

and then I said that (M) functions should handle missing values, it does not much matter what (S) functions do, and (U) functions either do not allow missing values or give a special meaning to them.

I said that it does not much matter what a category (S) function does because good programming style is to make sure that what is passed to it does not contain missing values, and that is easy to do. It is good programming style because, once the data are divided into separate, easy-to-use matrices, leaving the exclusions to the subroutines means that one subroutine might exclude one set of observations and another subroutine, another set. Remember, what distinguishes a category (S) function is that it works on raw data, and such data are invariably obtained from st_data() or st_view(), where it is easy to exclude the missing values at the outset.

Thus, I argued, although I did not say this explicitly, writing additional code in a category (S) program is probably a waste of time because

1.  It is probably better that a category (S) function does not allow missing values, because otherwise the user of the function may be led into sloppy and dangerous habits.

2.  Ben Jann <ben.jann@soz.gess.ethz.ch>, who asked the original question, said he was doing this by coding

        if (missing(x)) _error(3351)

    Good idea, except -missing(x)- can be expensive to calculate. The missing() function has to make a pass through the data, looking for missing values, to establish that there are not any.

Hence, even though it is probably better that a category (S) function does not allow missing values, there is a cost to imposing that.
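As a minimal sketch of the style described above, the do-file fragment below excludes the missing values when the data are fetched and then calls a category (S) function that includes the check. It assumes the auto dataset shipped with Stata, and colmeans_strict() is a hypothetical name made up for illustration, not a function from the original discussion.

```mata
sysuse auto, clear

mata:
// Hypothetical category (S) function: the mean of each column of raw data.
// It refuses missing values, at the cost of one extra pass through X.
real rowvector colmeans_strict(real matrix X)
{
        if (missing(X)) _error(3351)    // 3351: argument has missing values
        return(colsum(X) :/ rows(X))
}

// A selectvar of 0 (third argument) tells st_data() to drop every
// observation with a missing value in any of the listed variables.
X = st_data(., ("price", "rep78"), 0)
colmeans_strict(X)
end
```

When a view is preferred over a copy, st_view(V, ., ("price", "rep78"), 0) applies the same exclusion through its fourth argument.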
So here is what I add now:

In a category (S) function that does not accept missing values, it is acceptable to omit

    if (missing(x)) _error(3351)

as long as the function does something ugly in the presence of missing values. The ugly action could be to abort with error, or it could be to return a result with some or all values missing. As long as something ugly happens, the user of the function cannot be misled.

On the other hand, if the function would return something that could be misinterpreted as a valid result, one should probably include

    if (missing(x)) _error(3351)

-- Bill
wgould@stata.com

