The Stata listserver

Re: st: Preventing 'spreading' on merging files


From   Ernest Berkhout <ernestb@seo.fee.uva.nl>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Preventing 'spreading' on merging files
Date   Wed, 31 Mar 2004 12:39:08 +0200

At 12:28 31/03/2004, you wrote:
I have a data set with around 90,000 observations and I am merging it with
another dataset with around 3 million observations. However, when I merge the
two, the number of '3s' in the _merge variable exceeds the 90,000 in the master
file by a few thousand, indicating that there is more than one observation in
the using file for some of the observations in the working file.

Does anyone know if it is possible to prevent Stata from picking up the
additional observations from the using file (i.e., constraining the observations
merged to the 90,000 in the working file)?

It sounds like your key variable is not unique in the using dataset, so some records in your master set match more than one record in the using set and are therefore duplicated.
You might want to take a look at mmerge.ado, which addresses these issues more directly than the built-in merge command.
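
If you would rather stay with the built-in command, one possible approach is to make the key unique in the using file before merging. The following is only a rough sketch under several assumptions: the file names small.dta and big.dta and the key variable id are hypothetical placeholders, it uses the newer merge syntax (m:1, keep()) that is not available in older Stata releases, and it keeps an arbitrary record for each duplicated key, which may or may not be what you want.

```stata
* Sketch only: small.dta (master), big.dta (using) and id are hypothetical names.
* Run as a single do-file so that the temporary file survives until the merge.

use big.dta, clear
duplicates report id            // shows how many key values occur more than once

* Keep one record per key. Which record survives is arbitrary here;
* sort on a meaningful variable first if a specific record should be kept.
bysort id: keep if _n == 1
isid id                         // stops with an error if id is still not unique

tempfile big_unique
save `big_unique'

use small.dta, clear
* m:1 requires id to be unique in the using data, so no master record
* can be duplicated; keep() drops the unmatched using records.
merge m:1 id using `big_unique', keep(master match)
tabulate _merge
```

After this, the number of observations should equal the number in the master file, with _merge showing which of them found a match.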


Ernest Berkhout
SEO Amsterdam Economics
University of Amsterdam

Room 3.08
Roetersstraat 29
1018 WB Amsterdam
The Netherlands

tel.:+ 31 20 525 1657
fax:+ 31 20 525 1686
http://www.seo.nl
===========================
A statistician: someone who insists
on being certain about uncertainty
===========================