On Thursday, August 28, 2003, at 02:33 AM, John wrote:
What is the difference between the way .merge and .joinby work?  I've been
using joinby because it appears to work the same way relational databases
do, and I'm familiar with that concept.
That is correct: joinby forms every pairwise combination of observations
within each group defined by the key variables, that is, a Cartesian
product within groups (a cross join in RDBMS terms, not an outer join).
Users of RDBMS are exhorted to avoid unrestricted Cartesian products (run
a proposed SELECT statement with a cross join by your DBA and see what
s/he says). You rarely want one, since it generates a row (observation)
for every combination of the two sets (in Stataese, the master and using
datasets). More usually, you want to match the observations in the using
dataset with those in the master dataset, with a one-to-one, one-to-many,
or many-to-one merge. If you have about the same number of obs. in both
datasets, it would seem that you're really trying to do a one-to-one
merge. joinby does not guarantee that: wherever the key fails to identify
observations uniquely in both datasets, it generates every combination,
and in the worst case the result approaches 450^2 observations. 450^2
obs. and 47 variables is quite a bit larger than 450 x 45.
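
A small do-file sketch may make the difference concrete. The data, the
variable names (id, x, y) and the tempfile names are made up for
illustration, and the merge 1:1 syntax shown is the one used by newer
Stata releases, so treat this as an assumption-laden example rather than
a recipe for your data:

```stata
* Hypothetical data: id does NOT uniquely identify observations
clear
input id x
1 10
1 11
2 20
end
tempfile master
save `master'

clear
input id y
1 100
1 101
2 200
end
tempfile other
save `other'

* joinby pairs every master obs with every using obs that shares the same id
use `master', clear
joinby id using `other'
count    // id 1: 2 x 2 = 4, id 2: 1 x 1 = 1, so 5 observations

* merge 1:1 stops with an error when id is not unique,
* rather than silently inflating the dataset
use `master', clear
capture noisily merge 1:1 id using `other'
```

If id were unique in both files, the merge would succeed and keep one
observation per id, which is what a one-to-one match should give you.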
Kit