The Stata listserver

st: RE: _merge


From   "Nick Cox" <n.j.cox@durham.ac.uk>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: RE: _merge
Date   Tue, 26 Aug 2003 12:12:29 +0100

Dr. Frederick Wolfe
> 
> In agreement with Nick, the program does work as directed. 
> However, a case should be made against the long held Stata 
> position that _merge is inviolable.
> 
> In situations where one repeatedly merges data sets with 
> known characteristics, the old _merge variable is a major 
> nuisance. One so often forgets to drop it only to have your 
> program crash. 
> 
> One way to solve such a problem (as I perceive it to be) is 
> to put an option in -merge- that allows dropping _merge if 
> it already exists. That is, you would have to deliberately 
> use the option. For example, option = Drop_merge:
> 
> merge a b using c,dr
> 
> That would keep everyone happy and would prevent problems 
> like those described below.
> 
> In fact, I would even go further and make dropping _merge 
> the default, for if you really want to keep the variable 
> you should rename it.

When I wrote just now (about -merge-, in effect), it entered my 
head to say one more thing, yet I forgot. Now that one more thing 
is relevant. 

-merge- does not, as I may be thought to have implied or as Fred 
implies, absolutely insist that you use _merge as the name of the 
variable it creates. You can specify your own through the 
-_merge()- option, and it's often a good idea. 
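
As a minimal sketch of naming the indicator yourself: the toy 
datasets and the names id, x, y and merge_c are invented for 
illustration, and the syntax is the varlist form of -merge- used 
in this thread, with its -_merge()- option. 

```stata
* Sketch only: data and variable names are made up for illustration.
clear
input id y
1 10
2 20
4 40
end
sort id                      // the varlist form needs sorted data
tempfile c
save `c'

clear
input id x
1 1
2 2
3 3
end
sort id

* Name the indicator yourself, so that no variable _merge is created
merge id using `c', _merge(merge_c)

* Look at the indicator before deciding what to do next
tabulate merge_c
```

Because the indicator is called merge_c, a later -merge- can 
create its own _merge (or another named indicator) without 
complaint. In releases from Stata 11 onwards the same idea is 
spelled -merge 1:1 id using ..., generate(merge_c)-. 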

The default behaviour of -merge- is nevertheless there for your 
benefit. For every highly experienced Stata user like Fred, who 
despite vast experience occasionally forgets to drop _merge, 
there will be many less experienced users who forget more 
frequently, and on some of those occasions the resulting error 
will protect them against a serious mistake with their dataset. 

Now as for Fred's suggestion: I can't see Stata Corp 
implementing this, and if they are tempted 
I am going to beg them not to. Nor will it 
keep "everyone happy". In particular, 
I am astonished at the suggestion of changing 
the default behaviour of -merge-. How many ado 
files or do files would that break? (Even under 
version control.) What if you missed the announcement? 

As soon as dropping the merge variable were made available, 
some users would get into the habit of doing it often, or even 
always, mistakenly believing _merge to be somewhere between an 
irrelevance and dirt on a shoe, and the overall consequence 
would be a mess. And Stata Corp would be blamed for providing 
loaded guns with which users shot themselves. 

In any case, why does Fred want Stata Corp to provide this? 
It can be implemented easily in a user-written wrapper to -merge-: 

< code easy, but deliberately not supplied > 

but then you have only yourself to blame if things 
go wrong. 

Nick 
n.j.cox@durham.ac.uk 

P.S. One more detail, but it's pertinent: a key option like this 
would never be implemented with a short, cryptic option name. 
Some other commands now insist that you spell out all of an 
evocative option name, such as -force-, when that's what you're 
doing. -dangerously- might be the name to use here. 

There is a major difference between (1) 

. drop _merge 
. merge ... 

and (2)

. merge a b using c, dr 

(1) is, or should be, the result of your thought. 
Whether it is in fact the right thing to do is a 
matter for you and your dataset. If you typed a 
silly thing, it's your fault. 

(2) could all too easily be "the incantation 
which you use to get where you want to be". 
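
Workflow (1) might look like the following sketch, in which the 
-drop- follows a deliberate check of _merge. The datasets, the 
variable names, and the expectation that every observation 
matches are assumptions made up for the example; your own data 
may well call for a different check. 

```stata
* Sketch only: data, names and the all-matched expectation are invented.
clear
input id z
1 100
2 200
3 300
end
sort id
tempfile c
save `c'

clear
input id y
1 10
2 20
3 30
end
sort id
tempfile b
save `b'

clear
input id x
1 1
2 2
3 3
end
sort id

merge id using `b'
tabulate _merge              // look before you drop
assert _merge == 3           // stop here if anything failed to match
drop _merge                  // a considered decision, not a reflex

sort id
merge id using `c'
tabulate _merge
```

If the -assert- fails, the do-file stops before _merge is 
dropped, which is exactly the protection that a blanket 
drop-if-exists option would remove. 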
