Perhaps you could expand on this.
In the absence of mind-reading functionality in Stata,
(1) you need to tell it that 9 or 99 really means
missing, perhaps in special senses, if that is true.
(2) you have to keep it from interpreting 9 or 99 as
9 or 99, if it shouldn't.
(3) users themselves need to be straight on
which 9s and 99s mean 9 or 99 and which mean something else.
On (1), there were user requests for different types
of missing data in Stata for several years; perhaps the nuances
of "don't know", "didn't answer", "not applicable", etc.
didn't seem important enough to be attended to. But
eventually StataCorp gave in, and we now have these multiple types
(. and .a to .z).
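A minimal sketch of telling Stata about such codes, assuming a made-up variable myvar in which 9 stands for "don't know" and 99 for "not applicable" (the variable, the label name mymiss and the meanings of the codes are all invented for illustration):

```stata
* Toy data: myvar and its codes are assumptions, not from any real data set.
clear
input byte myvar
1
2
3
9
99
end

* Recode 9 and 99 to the extended missing values .a and .b.
mvdecode myvar, mv(9=.a \ 99=.b)

* Attach labels so that the special senses are not lost.
label define mymiss .a "don't know" .b "not applicable"
label values myvar mymiss

* Show the missing categories alongside the real values.
tabulate myvar, missing
```

After this, commands such as summarize leave out .a and .b without being asked.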
On (2), it is easy enough to go

    if !inlist(myvar, 9, 99)

or whatever, but also easy enough to forget
to do this _all the time_.
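A small sketch of what forgetting costs, again with a made-up myvar in which 9 and 99 are assumed to be missing-value codes rather than real values:

```stata
* Toy data: myvar and its values are invented for illustration.
clear
input byte myvar
1
2
3
9
99
end

* Forgetting the qualifier: 9 and 99 are treated as real values.
summarize myvar

* Remembering it: only 1, 2 and 3 are used.
summarize myvar if !inlist(myvar, 9, 99)
```

The first mean is (1 + 2 + 3 + 9 + 99)/5 = 22.8; the second is 2.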
On (3), it should be clear that being inconsistent
here is a sure road to confusion. Despite many stories
of silly codings, e.g. missing ages being coded
999 and inflating the average age of drivers
in car accidents, these practices persist. However,
Stata is clearly not to blame.
What is Stata still missing on missing values?
Nick
n.j.cox@durham.ac.uk
Richard Williams wrote:
> As a sidelight, I can't say that I am real fond of Stata's way of
> handling MD. I work with secondary data all the time, and you'll
> routinely see MD codes of 9, 99, etc. I've never seen another program
> or non-Stata data set that handles MD like Stata does.