Statalist



Re: st: merging datasets and getting different N in resulting dataset if I run several times


From   Woolton Lee <finished07@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: merging datasets and getting different N in resulting dataset if I run several times
Date   Fri, 14 Aug 2009 11:46:54 -0400

I am using 6 id variables to merge, all of which were character and
which I have now converted to numeric.  Even after switching to stable
sorting and numeric id variables, the problem persists, though it is
smaller in magnitude: previously I was getting differences of 5-9 obs
in the resulting dataset between different runs; now it's more like 1.
The id variables do not uniquely identify observations in the master
or using dataset.  Having said that, it seems to me that this should
not occur at all even if there is duplication, and I am at a loss to
understand why it's occurring.  A colleague of mine suggested there is
a bug in the merge command; I am using Stata 9.
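
A minimal sketch of the checks raised in this thread (duplicate keys,
missing values in the merge variables, stable sorting) might look like
the following.  The file names master_data, using_data, and
using_sorted are hypothetical, and COCODE and year (the keys named in
the original post) stand in for the full list of id variables.  It
uses the Stata 9 merge syntax and is meant as a diagnostic aid, not a
fix.

```stata
* Sketch only: master_data, using_data and using_sorted are hypothetical
* file names; COCODE and year stand in for the full list of merge keys.

* 1. Using dataset: report duplicate and missing keys, then sort stably.
use using_data, clear
duplicates report COCODE year
count if missing(COCODE) | missing(year)
sort COCODE year, stable
save using_sorted, replace

* 2. Master dataset: same checks, same stable sort.
use master_data, clear
duplicates report COCODE year
count if missing(COCODE) | missing(year)
sort COCODE year, stable

* 3. Merge, then record match status and N to compare across runs.
merge COCODE year using using_sorted
tabulate _merge
count
```

If the counts from steps 1 and 2 are identical across runs but the N
after step 3 differs, the variation would seem to arise in the merge
step itself; if they already differ, it presumably arises earlier in
the program.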

Woolton

On Fri, Aug 14, 2009 at 10:44 AM, Austin Nichols<austinnichols@gmail.com> wrote:
> Woolton Lee<finished07@gmail.com> :
> Probably due to unstable sorting; without further info, hard to diagnose.
> Do you have missing values in any of the merge vars?
> This is a potentially very serious problem; see e.g.
> #4 in http://www.princeton.edu/~jrothst/hoxby/rejoinder.pdf
>
> On Fri, Aug 14, 2009 at 10:28 AM, Woolton Lee<finished07@gmail.com> wrote:
>> Hi, I am having a problem where I am merging two datasets together and
>> the N in the resulting dataset can change if I rerun the program 2 or
>> more times.  I am merging by company code (COCODE) and year which do
>> not uniquely identify observations in the using dataset, but it seems
>> to me that that should not matter.  I get the same result if I use the
>> joinby command - the resulting N in the dataset changes if I rerun the
>> program.  I am trying to understand why this might happen and am
>> stumped at the moment.  Does anyone have any suggestions?
>>
>> W
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>



