Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Comparing two data set


From   Nick Cox <[email protected]>
To   [email protected]
Subject   Re: st: Comparing two data set
Date   Sat, 5 Mar 2011 19:42:21 +0000

This illustrates what I had in mind:

sysuse bpwide, clear
gen dataset = 1
save bpwide_1, replace

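sysuse bpwide, clear
* plant some discrepancies: flip sex, change agegrp, alter two readings
replace sex=abs(sex-1) if mod(patient,13)==0
replace agegrp=2 if agegrp != 2 & mod(patient,11)==0
replace bp_before=bp_after if patient==100
replace bp_after=145 if patient==100
* add an observation that is not in the first dataset
input
121 1 1 120 119
end
* swap the identifiers of the first two observations
replace patient=2 if _n==1
replace patient=1 if _n==2
gen dataset = 2
save bpwide_2 , replace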

append using bpwide_1

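* tag = number of other observations identical on these variables;
* an observation matched exactly once across the two datasets has tag 1
duplicates tag patient sex agegrp bp*, gen(tag)
sort patient dataset
edit if tag != 1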

In a real case, the two supposedly identical datasets would already
exist. The heart of the technique would be (supposing an identifier
-id-)

use dataset1, clear
gen dataset = 1
save dataset_1

use dataset2, clear
gen dataset = 2
save dataset_2

append using dataset_1
ds dataset, not
duplicates tag `r(varlist)', gen(tag)
sort id dataset
edit if tag != 1
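
To see what -tag- holds after the append-and-tag steps above, here is a
minimal self-contained sketch. It is an illustration under assumptions,
not part of the original recipe: -bpwide- stands in for the two real
datasets, the file names copy_1 and copy_2 are hypothetical, a single
discrepancy is planted on purpose, and -list- replaces -edit- so that
the result can be read in a log.

```stata
* Sketch only: bpwide stands in for the two real datasets,
* and the file names copy_1 and copy_2 are hypothetical.
sysuse bpwide, clear
gen dataset = 1
save copy_1, replace

sysuse bpwide, clear
* plant exactly one discrepancy
replace bp_after = bp_after + 1 if patient == 100
gen dataset = 2
save copy_2, replace

append using copy_1
ds dataset, not
duplicates tag `r(varlist)', gen(tag)

* tag counts the other observations identical on the listed variables:
* 1 = matched once across the two datasets, 0 = no match at all
tabulate tag dataset
sort patient dataset
list patient dataset bp_before bp_after if tag != 1, sepby(patient)
```

Here only the two versions of patient 100 should be listed, one from
each dataset, so the differing values sit side by side.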

The technique extends to multiple identifiers or to datasets lacking
identifiers (create one first using observation numbers). The
-dataset- identifier would naturally need to be a variable name not in
use. Also, this is only the start: it shows discrepant observations.
They now have to be fixed.
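
For datasets lacking an identifier, the preparation step might look
like the sketch below. It rests on assumptions that are not in the
recipe above: -auto- stands in for a dataset with no identifier, the
names -id- and -dataset- are merely candidates, and observation numbers
only line up if both datasets hold their observations in the same
order.

```stata
* Sketch only: auto stands in for a dataset with no identifier.
sysuse auto, clear

* stop with an error if either name is already in use
confirm new variable id dataset

* observation number as identifier; repeat in the second dataset
gen long id = _n
gen dataset = 1

* check that id uniquely identifies the observations
isid id
```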

Nick

On Sat, Mar 5, 2011 at 2:20 PM, Dirk Enzmann
<[email protected]> wrote:
> In reply to
>
>
> http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/statalist.1103/date/article-326.html
>
> -------------------------------------------------------------
>
> I did not intend to argue that using -merge- is superior to -append- but
> simply tried to answer Rajaram's question by demonstrating how his problem
> could be solved by using official commands. I know how to use -merge- to
> achieve this, but I don't know how to use -append- to achieve the same.
>
> Can you show (perhaps using the example data I created) how this can be
> done using -append-?