Statalist



Re: st: comparing two .dta files


From   "Sergiy Radyakin" <serjradyakin@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: comparing two .dta files
Date   Tue, 12 Feb 2008 15:05:39 -0500

Dear Gabi,

Why not just byte-compare the two saved files, ignoring the difference
in the date/time stamp (which is also saved in the file)? That would
confirm that the files (data, labels, notes, etc.) are exactly the
same.
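
A minimal sketch of such a byte comparison, written in Python rather
than Stata. The function name files_equal_except and its skip_ranges
parameter are hypothetical, introduced only for illustration. Where the
date/time stamp sits inside a .dta file depends on the file format
version, so the (offset, length) pair to skip is an assumption that has
to be looked up in the format description, not something this sketch
knows.

```python
def files_equal_except(path_a, path_b, skip_ranges=()):
    """Return True if two files are byte-identical outside skip_ranges.

    skip_ranges is an iterable of (offset, length) pairs to ignore,
    e.g. the region holding the saved date/time stamp (assumed to be
    known by the caller; it is not detected here).
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a = bytearray(fa.read())
        b = bytearray(fb.read())

    # Files of different size cannot be identical.
    if len(a) != len(b):
        return False

    for offset, length in skip_ranges:
        # Clamp the range to the file size, then zero it in both
        # buffers so those bytes no longer affect the comparison.
        end = min(offset + length, len(a))
        start = min(offset, end)
        a[start:end] = bytes(end - start)
        b[start:end] = bytes(end - start)

    return a == b
```

A call would look like files_equal_except("prototype.dta",
"production.dta", [(offset, length)]), with both file names and the
range supplied by the user. The check is strict: anything that changes
the bytes, such as a different sort order or storage type, counts as a
difference even when the values agree.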

Regards, Sergiy



On 2/12/08, Gabi Huiber <ghuiber@gmail.com> wrote:
> Dear all,
>
> Maybe I'm going about this the wrong way, but here's what I'm trying to do:
>
> I was recently given a do-file that insheets a bunch of files, works
> on them, then produces a .dta file 37 variables wide and several
> thousand observations long. Let's call this do-file the prototype
> version.
>
> I had to rewrite this do-file to make it more portable (e.g. by using
> global macros to define file paths, names, etc.) and more expandable
> (e.g. various parameters that are now typed in could be accessed from
> a separate .dta file; the length of this file would change as new
> values for those parameters were added over time). So I did. Let's
> call the new file the production version.
>
> Now I am trying to make sure that the prototype and the production
> versions produce the exact same 37-variable .dta file.
>
> I gathered the variable names in a local macro as follows:
>
> unab mergeby: _all
>
> I then sorted and merged the two files by the variables in `mergeby',
> and found that _merge wasn't equal to 3 everywhere, so I had to
> investigate some more. That's where I hit trouble.
>
> I thought I would do this:
>
> drop if _merge==3
> sort _merge `mergeby'
> rename _merge j
> bysort j: gen id=_n
> reshape wide `mergeby', i(id) j(j)
>
> At this point, the file is 37*2+1 variables wide. I would find it
> useful to declare two new macros, one with the `mergeby' variables
> with the suffix 1, the other with the `mergeby' variables with the
> suffix 2.
>
> Since the local `mergeby' has 37 words in it, you would think that
> `mergeby1' and `mergeby2' would also have 37 words each, based on the
> code below:
>
> local mycount: word count `mergeby'
> di `mycount'
> forvalues k=1/2 {
>     local mergeby`k' ""
> }
> forvalues i=1/`mycount' {
>     local var: word `i' of `mergeby'
>     forvalues k=1/2 {
>         local mergeby`k'="`mergeby`k'' `var'`k'"
>     }
> }
> forvalues k=1/2 {
>     local x: word count `mergeby`k''
>     di `x'
> }
>
> However, they are only 22 words long. I have no idea why.
>
> Are there any word count restrictions when trying to gather local
> macros word-by-word like I am trying to do?
>
> Is there some more elegant way to compare two files and isolate any differences?
>
> Thank you,
> Gabi
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>


