st: RE: _merge
Dr. Frederick Wolfe
> In agreement with Nick, the program does work as directed.
> However, a case should be made against the long held Stata
> position that _merge is inviolable.
> In situations where one repeatedly merges data sets with
> known characteristics, the old _merge variable is a major
> nuisance. One so often forgets to drop it only to have your
> program crash.
> One way to solve such a problem (as I perceive it to be) is
> to put an option in -merge- that allows dropping _merge if
> it already exists. That is, you would have to deliberately
> use the option. For example, option = Drop_merge:
> merge a b using c,dr
> That would keep everyone happy and would prevent problems
> like those described below.
> In fact, I would even go further and make dropping _merge
> the default, for if you really want to keep the variable
> you should rename it.
When I just wrote (about -merge-, in effect), it entered my head
to say one more thing, yet I forgot. That one more thing is that
-merge- does not, as I may be thought to have implied or as Fred
implies, absolutely insist that you use _merge as the name of the
created variable. You can specify your own, and it's often
a good idea.
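A minimal sketch of naming the merge variable yourself. The dataset c, the key variable id and the name merge_c are hypothetical, and both datasets are assumed to be sorted by id already. The first form uses the older syntax that matches the commands in this thread; the second is the form documented for Stata 11 and later. They are alternatives, not steps to run one after the other.

```stata
* Older syntax: _merge() names the variable that records match status
merge id using c, _merge(merge_c)

* Stata 11 and later: generate() does the same job
merge 1:1 id using c, generate(merge_c)

* Either way, look at the result before moving on
tabulate merge_c
```

With a name of your own choosing, a later -merge- has no leftover _merge to collide with, and each merge's record of what matched stays available for checking.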
The default behaviour of -merge- is nevertheless there for your
benefit. For every highly experienced Stata user like Fred, who
despite vast experience nevertheless forgets this occasionally,
there will be many more less experienced users who will forget this
more frequently --- and some of those times they may be
protected against some serious mistake to do with their
Now as for Fred's suggestion: I can't see Stata Corp
implementing this, but if they are tempted
I am going to beg them not to. In addition, this
just will not keep "everyone happy". In particular,
I am astonished at the suggestion of changing
the default behaviour of -merge-. How many ado
files or do files is that going to break? (Even under
version control.) What if you missed the announcement?
As soon as dropping the merge variable were made available,
some users would get into the habit of doing it often, or even
always, mistakenly believing _merge to be somewhere between an
irrelevance and dirt on a shoe, and the total consequences of
that would be a mess. And Stata Corp would be blamed for providing
loaded guns which users shot themselves with.
In any case, why does Fred want Stata Corp to provide this?
It can be implemented easily in a user-written wrapper to -merge-:
< code easy, but deliberately not supplied >
but then you have only yourself to blame if things
P.S. One more detail, but it's pertinent: a key option like this
should not be implemented by a short cryptic option name. Some other
commands now insist that you spell out all of an evocative option
name, such as -force-, when that's what you're doing.
-dangerously- might be the name to use here.
There is a major difference between (1)
. drop _merge
. merge ...
and (2)
. merge a b using c, dr
(1) is, or should be, the result of your thought.
Whether it is in fact the right thing to do is a
matter for you and your dataset. If you typed a
silly thing, it's your fault.
(2) could all too easily be "the incantation
which you use to get where you want to be".
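For readers who still want it, the user-written wrapper mentioned above might look something like the sketch below. The program name merge_dangerously is hypothetical, not an existing command, and the sketch does nothing beyond dropping any existing _merge and handing the rest of the command line to -merge-. Everything said above about having only yourself to blame applies.

```stata
* Hypothetical wrapper; merge_dangerously is an assumed name, not an official command
program define merge_dangerously
    * Drop an existing _merge; capture suppresses the error if there is none
    capture drop _merge
    * Pass the whole command line on to -merge- unchanged
    merge `0'
end
```

It would be called exactly as -merge- is, for example

. merge_dangerously a b using c

The long name is deliberate: typing it out is a reminder of what is being thrown away.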