Statalist



st: _merge complaint


From   Jeph Herrin <junk@spandrel.net>
To   statalist@hsphsun2.harvard.edu
Subject   st: _merge complaint
Date   Fri, 21 Mar 2008 11:39:41 -0400

The manual entry for -merge- (Stata 10) describes
the contents of _merge when more than one -using-
dataset is used:

 "_merge is the standard result variable that
 we have discussed before: 1 means that the observation
 came from the master, 2 means that it came from the
 using, and 3 means that it came from both."

While the behaviour described here is what I would expect
("3 means that it came from both [master and using]"), it is
not in fact what _merge contains. The example indicates
otherwise, as does the online documentation:

 _merge==3    obs. from at least two datasets, master or using

That is, _merge==3 could mean that the observation came from two
using datasets and not from the master at all.

My complaint is two-fold. First, the online and printed documentation
should agree (I hope this is uncontroversial!). Second, I think it
would be more logical for _merge to behave consistently whether
there are one or two using files: it should be 3 only if there is
a master record and at least one using record. The terminology
employed by -merge- ("master" and "using") gives decided precedence
to the "master" dataset, so why shouldn't _merge do the same? Then
_merge1, _merge2, etc. can clarify which using dataset the
observation came from.
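
The difference between the two coding rules can be sketched in a
few lines. This is an illustrative Python model only: it is not
Stata code and not how -merge- is implemented. The function names
are hypothetical, and coding a record found only in using datasets
as 2 under the proposed rule is an assumption, since the proposal
states only when 3 should apply.

```python
def merge_code_online(in_master: bool, n_using: int) -> int:
    """_merge as the online help describes it:
    3 = record found in at least two datasets, master or using."""
    total = int(in_master) + n_using
    if n_using < 0 or total == 0:
        raise ValueError("record must come from at least one dataset")
    if total >= 2:
        return 3
    return 1 if in_master else 2


def merge_code_proposed(in_master: bool, n_using: int) -> int:
    """Proposed rule: 3 only if the record is in the master AND in
    at least one using dataset. Using-only records are coded 2
    (assumption: the proposal does not spell this case out)."""
    if n_using < 0 or (not in_master and n_using == 0):
        raise ValueError("record must come from at least one dataset")
    if in_master and n_using >= 1:
        return 3
    return 1 if in_master else 2


# A record present in two using datasets but absent from the master:
print(merge_code_online(False, 2), merge_code_proposed(False, 2))
```

In this model the two rules agree whenever there is a single using
dataset; they differ only for a record found in two or more using
datasets and not in the master.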


Jeph



