Notice: On March 31, it was **announced** that Statalist is moving from an email list to a **forum**. The old list will shut down on April 23, and its replacement, **statalist.org**, is already up and running.

From: Christopher Baum <kit.baum@bc.edu>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: Re: st: Same code, same machine, same data, different results
Date: Thu, 6 Sep 2012 14:09:52 +0000

On Sep 6, 2012, at 2:33 AM, Dmitriy wrote:

> Do you have any m:m merges by any chance?
>
> DVM
>
> On Wed, Sep 5, 2012 at 2:10 PM, Mattia Landoni <mattia.landoni@gmail.com> wrote:
>> Dear statalisters,
>>
>> a friend of mine has a bizarre problem. She is running a regression as follows:
>>
>> xi: regress a b c i.d i.e
>>
>> and her output is different every time. Has anyone ever seen a
>> behavior like this? Below are some details.
>>
>> Environment:
>> - Stata 11
>> - Windows 32-bit
>>
>> Precise description:
>> The do-file imports several files from .csv, then merges them, then
>> runs the regression. If I run the do-file, I get certain results. If I
>> issue the same regression command again, I get again the same results,
>> as it should be. However, if I re-run the do-file from the beginning,
>> I get slightly different results and the regression even reports a
>> slightly different number of observations. (Say, 2663 vs. 2666). Every
>> time all the data are taken afresh from the same static .csv sources.
>> There is nothing random about the do-file, that I know. The xi:
>> command generates about 200 i-variables and a few, maybe 10, are
>> dropped because of collinearity. There are more than 2500
>> observations.

This is EXACTLY what happens when you do an m:m merge. (See IMEUS (Baum, 2006), 3.7.2 for why you really shouldn't even try.)

I once spent 2 hours with one of my (very bright) grad students who was having this kind of problem in his do-file, with the old merge command, and we tracked it down to a non-unique merge key: in essence, what is now called an m:m merge.

I recently had an exchange with a user on the LinkedIn Stata forum about this issue; he wanted to know whether Stata had 'fixed' the merge command in Stata 12 so that it did m:m merges correctly. I argued that there is no clear definition, in database terms, of what you are doing with an m:m merge, so no 'fix' would be forthcoming. He said he relied on SAS to do it, with PROC SQL, which perhaps has some hardwired rules about how to handle the innate indeterminacy of such an operation.

Kit

Kit Baum | Boston College Economics & DIW Berlin | http://ideas.repec.org/e/pba1.html
An Introduction to Stata Programming | http://www.stata-press.com/books/isp.html
An Introduction to Modern Econometrics Using Stata | http://www.stata-press.com/books/imeus.html

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
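
For readers who want to test for the non-unique merge key described in the message above, here is a minimal Stata sketch. It is not taken from the original do-file: the dataset names master.dta and lookup.dta and the key variable id are hypothetical placeholders. It assumes the intended relationship is many-to-one, so that id should be unique in the lookup file.

```stata
* Sketch only: master.dta, lookup.dta and the key variable id are
* hypothetical names, not taken from the original do-file.

* Step 1: inspect the key in the file that is supposed to have one row per id.
use lookup, clear
* Show how many observations share each value of id (informational).
duplicates report id
* Stop with an error here if id does not uniquely identify observations.
isid id

* Step 2: declare the relationship you expect instead of using m:m.
use master, clear
* With m:1, merge itself refuses to run if id is not unique in lookup.dta,
* so a duplicated key surfaces as an error rather than as shifting results.
merge m:1 id using lookup

* Review the match results before running any regression.
tabulate _merge
```

If isid or the m:1 merge stops with an error, the key is duplicated in the lookup file, and that duplication is the first thing to resolve. Stating the relationship as 1:1, m:1 or 1:m makes Stata check it on every run, so a do-file of this kind either produces the same merged data each time or fails visibly.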