Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: David Hoaglin <dchoaglin@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Puzzling error with merge
Date: Mon, 13 Jan 2014 16:11:30 -0500

Phil (and Joe),

Thanks for the suggestions.

I received an Excel file containing two sheets and imported them into separate Stata data files. Those are the files that I was trying to merge, treating the file from Sheet 1 as the master.

On each of those Stata data files, I used -tabulate- with StudyID as a categorical variable (its type is string). I expected the output to show a frequency of 2 (or more) for any duplicate, but each value of StudyID had a frequency of 1 in both files, and the total frequency was correct in both files.

I'll look for extra blank observations at the end. Multiple blank values of StudyID would certainly show up as duplicates in the -merge-. If the observations are completely blank, I'm not sure how to recognize them. Perhaps the numerical variables in the files will appear as missing.

David Hoaglin

On Mon, Jan 13, 2014 at 11:47 AM, Phil Schumm <pschumm@uchicago.edu> wrote:
> On Jan 13, 2014, at 8:04 AM, David Hoaglin <dchoaglin@gmail.com> wrote:
>> My key variable has no missing values in either file.
>
>
> Is that statement based on what you know to be true from the original data (i.e., as you scan through them in the Excel file), or based on the results of
>
>     assert !mi(StudyID)
>
> It is often the case that data imported from Excel contain extra (blank) observations at the end. I don't mean to question the veracity of your statement, but given what you've said, the existence of some extra observations in the master file with missing values for StudyID appears to be the only possible explanation (unless Joe is right, and you accidentally skipped over a duplicate in the output from -tabulate-).
>
>
> -- Phil
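
The checks discussed in this thread (blank values of StudyID, and duplicates that might be overlooked in -tabulate- output) could be run roughly as follows. This is only a sketch, to be run with one of the two imported files in memory; the variable name dup is made up for illustration and is not from the original posts. One relevant detail: -tabulate- leaves out missing values, including empty strings, unless the -missing- option is specified, so blank StudyID values would not appear in its default output.

```stata
* How many observations have an empty StudyID?
* (for a string variable, "" counts as missing)
count if missing(StudyID)

* Cells that contain only spaces are not "" but are still effectively blank
count if trim(StudyID) == "" & StudyID != ""

* Inspect the blank observations to see whether the other variables
* are missing as well (i.e., whether the whole row is empty)
list if missing(StudyID)

* Summarize how many copies of each StudyID value exist
duplicates report StudyID

* Flag and list any observations that share a StudyID
* (dup is a hypothetical variable name)
duplicates tag StudyID, generate(dup)
list if dup > 0

* Stops with an error unless StudyID uniquely identifies the observations;
* by default it also rejects missing values of StudyID
isid StudyID
```

If -isid- completes without error in both files, StudyID should be usable as the key for a 1:1 -merge-.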