Statalist The Stata Listserver



st: re: changing numbers of observations


From   Kit Baum <baum@bc.edu>
To   statalist@hsphsun2.harvard.edu
Subject   st: re: changing numbers of observations
Date   Mon, 28 May 2007 10:40:42 -0400


Jesse said

I have a large do file that I've created to clean a dataset. It
includes some merging as well as forvalues commands. The first line of
code is use "filepath", clear

I've run it five times now and each time there has been a different
number of observations at the end (n=7639, 7637, 7645, 7629, and 7641)

It seems that if the do file starts with the same file and does the same
thing every time it should be returning the same observations every
time. Any ideas on what is happening here?


This is a FAQ. Merging when neither dataset has a unique merge key causes this problem. Section 3.7.2 of my book (URL below), "The dangers of many-to-many merges", discusses this issue.

Make sure the merge key is unique in at least one of the two datasets, and use the uniqusing or uniqmaster option on merge so that Stata enforces that uniqueness.
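As a rough sketch of that advice, the do-file fragment below checks the key before merging and then asks merge to enforce uniqueness on the using side. The file names (master.dta, using.dta, using_sorted.dta) and the key variable id are hypothetical placeholders, not anything from Jesse's do file, and the old-style merge syntax shown assumes a Stata release of roughly this vintage (9 or 10).

```stata
* Hypothetical names throughout: master.dta, using.dta, using_sorted.dta, id.

* Step 1: confirm that id uniquely identifies observations in the using data.
use "using.dta", clear
isid id                  // stops with an error if id is not unique
* duplicates report id   // alternative: tabulates how many copies of each id exist

* Old-style merge expects both datasets to be sorted by the key.
sort id
save "using_sorted.dta", replace

* Step 2: merge, asking Stata to insist on unique keys in the using data.
use "master.dta", clear
sort id
merge id using "using_sorted.dta", uniqusing
tabulate _merge          // inspect how many observations matched

* In Stata 11 and later, the equivalent request would be written as:
*   merge m:1 id using "using_sorted.dta"
```

If isid or merge stops with an error here, the key is duplicated on the side that was supposed to be unique, which is the condition that would need fixing before the observation count can be expected to be reproducible.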



Kit Baum, Boston College Economics and DIW Berlin
http://ideas.repec.org/e/pba1.html
An Introduction to Modern Econometrics Using Stata:
http://www.stata-press.com/books/imeus.html





