
From: "Nick Cox" <n.j.cox@durham.ac.uk>

To: <statalist@hsphsun2.harvard.edu>

Subject: st: RE: suggestion for missing()

Date: Mon, 15 Sep 2008 18:13:56 +0100

In case it gets lost, I'll stick in here a reminder that -dropmiss- exists to do what Jeph does in his examples. -search- for locations.

On the main point: I've wanted something like this more than once, so I sympathise. Whether this is really a good idea I don't know. It may look cosmetic, but it is rather a fundamental change to Stata's syntax, and it would introduce a diversity of allowable syntaxes when consistency is arguably a very good thing. If this were done, then it should be done consistently across similar functions such as -max()- and -min()- as well.

Jeph, however, I think introduces some red herrings here. Choice of terminology confuses the several intersecting issues. Some of the fault is Stata's in that when -egen- was introduced the members of its family were called -egen- functions. I don't have a better name to suggest, but I think this similarity has been widely (although not deeply) confusing.

First off, note that despite similar names, functions and -egen- functions are really quite different beasts. Stata's functions are not that different from functions in many other languages, but -egen- functions are very idiosyncratic. The name really is exact: -egen- functions work __only__ with -egen-.

Jeph mentions -rowmiss()- and -rowtotal()- and calls them row operators. They are, strictly, -egen- functions. The fact that they are defined to work across rows, meaning strictly observations, is just that, a fact. -egen- functions could have any syntax for their argument that you wanted. Some syntaxes would seem perverse, but anything programmable is possible so long as it passes -egen-.

Jeph then goes on to talk about column operators, but here his informal use of terminology becomes, potentially, rather misleading. Operators in most languages, although certainly not all, seem to be distinguished from functions largely by whether they are implemented via special symbols (e.g. + - * | &) or via names. That is an accident of implementation which we could ponder, but, keeping to the point, let me just underline that when Jeph says column operators I think he means Stata functions, strict sense. Such functions are not designed to work with columns, meaning strictly variables, or indeed anything in particular. They are designed to work with anything that satisfies their syntax. Whether I say -missing(1,2,3,4)- or -missing(a[1], a[2], a[3], a[4])- or -missing(x, y, z)- is all one to -missing()- so long as the arguments fit the syntax. The results in context will differ because the rest of Stata is so smart, but I think -missing()- is just a mindless machine.

This is mostly just yet another plea to use Stata's terminology when discussing Stata!

Nick
n.j.cox@durham.ac.uk

Jeph Herrin

This is mostly a suggestion to StataCorp; perhaps it has been made or explained elsewhere.

The function -missing()- is quite useful, but I'd like to propose that it be modified to take a -varlist- as argument.

First, it would be even more useful if one could specify many variable names using shorthand. E.g., why not

    drop if missing(q1-q23)

or

    drop if missing(_all)

?

Second, this would be consistent with other row operators such as -rowmiss- & -rowtotal-, which take varlists. At least, it seems like that is the Stata convention: row operators take varlists, column operators take comma-separated lists. Perhaps I'm wrong on this, but it seems enough of a convention that I invariably try to stick a varlist in -missing()- anyway.
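As a minimal sketch of the two syntaxes contrasted in this exchange, the following uses a made-up toy dataset. The variables q1-q3 and the new variable names anymiss and nmiss are purely illustrative and do not come from the thread.

```stata
* Toy data, purely illustrative (not from the thread)
clear
input q1 q2 q3
1 2 3
. 2 3
1 . .
4 5 6
end

* missing() is an ordinary Stata function: its arguments are a
* comma-separated list of expressions, not a varlist
generate byte anymiss = missing(q1, q2, q3)

* rowmiss() is an -egen- function: it takes a varlist, so
* shorthand such as q1-q3 is allowed
egen nmiss = rowmiss(q1-q3)

* Either result can drive the kind of -drop- Jeph describes
drop if nmiss > 0
list
```

Here -drop if anymiss- would select the same observations, since -missing()- returns 1 whenever any of its arguments is missing.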

**Follow-Ups**: **Re: st: RE: suggestion for missing()** *From:* Jeph Herrin <junk@spandrel.net>

**References**: **st: suggestion for missing()** *From:* Jeph Herrin <junk@spandrel.net>
