From
"Sergiy Radyakin" <serjradyakin@gmail.com>

To
statalist@hsphsun2.harvard.edu

Subject
Re: st: comparing two .dta files

Date
Tue, 12 Feb 2008 15:05:39 -0500

Dear Gabi,

why wouldn't you just byte-compare the two saved files ignoring the differences in the date/time (which is also saved in the file)? This will make sure that the files (data/labels/notes/etc) are exactly the same.

Regards,
Sergiy

On 2/12/08, Gabi Huiber <ghuiber@gmail.com> wrote:
> Dear all,
>
> Maybe I'm going about this the wrong way, but here's what I'm trying to do:
>
> I was recently given a do-file that insheets a bunch of files, works
> on them, then produces a .dta file 37 variables wide and several
> thousand observations long. Let's call this do-file the prototype
> version.
>
> I had to rewrite this do-file to make it more portable (e.g. by using
> global macros to define file paths, names, etc.) and more expandable
> (e.g. various parameters that are now typed in could be accessed from
> a separate .dta file; the length of this file would change as new
> values for those parameters were added over time). So I did. Let's
> call the new file the production version.
>
> Now I am trying to make sure that the prototype and the production
> versions produce the exact same 37-variable .dta file.
>
> I gathered the variable names in a local macro as follows:
>
> unab mergeby: _all
>
> I then sorted and merged the two files by the variables in `mergeby,'
> and found that _merge wasn't at all equal to 3 everywhere, so I had to
> investigate some more. That's where I hit trouble.
>
> I thought I would do this:
>
> drop if _merge==3
> sort _merge `mergeby'
> rename _merge j
> bysort j: gen id=_n
> reshape wide `mergeby', i(id) j(j)
>
> At this point, the file is 37*2+1 variables wide. I would find it
> useful to declare two new macros, one with the `mergeby' variables
> with the suffix 1, the other with the `mergeby' variables with the
> suffix 2.
>
> Since the local `mergeby' has 37 words in it, you would think that
> `mergeby1' and `mergeby2' would also have 37 words each, based on the
> code below:
>
> local mycount: word count `mergeby'
> di `mycount'
> forvalues k=1/2 {
>    local mergeby`k' ""
> }
> forvalues i=1/`mycount' {
>    local var: word `i' of `mergeby'
>    forvalues k=1/2 {
>       local mergeby`k'="`mergeby`k'' `var'`k'"
>    }
> }
> forvalues k=1/2 {
>    local x: word count `mergeby`k''
>    di `x'
> }
>
> However, they are only 22 words long. I have no idea why.
>
> Are there any word count restrictions when trying to gather local
> macros word-by-word like I am trying to do?
>
> Is there some more elegant way to compare two files and isolate any differences?
>
> Thank you,
> Gabi
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
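
The byte comparison suggested in the reply could be sketched roughly as follows. This is only an illustration, not something posted in the thread. It is written in Python because the comparison happens outside Stata. The function name and the two constants are hypothetical. The offset (91) and length (18) of the timestamp field are an assumption based on the header layout of the older .dta formats current at the time of this post (e.g. format 114), and would need checking against the format of the files actually being compared.

```python
# Assumed position of the time_stamp field in an older-format .dta header:
# 1 (format) + 1 (byte order) + 1 (file type) + 1 (unused)
# + 2 (nvar) + 4 (nobs) + 81 (data label) = 91, followed by 18 bytes.
# Verify these against the .dta format actually in use.
TIMESTAMP_OFFSET = 91
TIMESTAMP_LENGTH = 18


def equal_outside_span(path_a, path_b,
                       skip_start=TIMESTAMP_OFFSET,
                       skip_len=TIMESTAMP_LENGTH):
    """Return True if both files are byte-identical outside the skipped span."""
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        a = file_a.read()
        b = file_b.read()
    # Files of different size cannot be the same dataset.
    if len(a) != len(b):
        return False
    skip_end = skip_start + skip_len
    # Compare everything before and everything after the skipped bytes.
    return a[:skip_start] == b[:skip_start] and a[skip_end:] == b[skip_end:]
```

For example, `equal_outside_span("prototype.dta", "production.dta")` (both file names are placeholders) would return True only when the two saved files differ in nothing but the assumed timestamp bytes. Note that a byte comparison is sensitive to sort order and variable order, so both do-files would have to leave the data sorted and ordered identically before saving.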

