.-
help for ^compdta^                                              (STB-28: dm36)
.-

Compare the contents of two Stata data sets
-------------------------------------------

    ^compdta^ [varlist] [^if^ exp] [^in^ range] using filename
            [^,^ ^k^eep^(^max|min^)^ ^l^ist ^n^oisily ^s^ort]


Description
-----------

^compdta^ compares the data in memory (the master data) with the data in a
Stata-format data set specified by filename (using data). If no varlist is
specified, the comparison is performed for all variables in the master data.
If no ^if^ or ^in^ clause is specified, the comparison is between all
observations in the master data and all observations in the using data.
Neither the master nor the using data set is permanently altered by the
comparison.

^compdta^ uses the ^merge^ command to combine the master and using data sets,
so Stata's ^maxvar^ parameter must be set a little larger than twice the
number of variables in varlist, and the ^width^ parameter must be set a little
larger than twice the width of the varlist; the ^noisily^ option will display
the required settings for the maxvar and width parameters.


Options
-------

^keep^ retains in memory a part of the data set created by ^merge^ to perform
    the comparisons; by default, this data set is replaced with the master
    data set at exit.

    ^keep(max)^ retains the largest relevant portion of the merged data: all
    observations specified by the ^if^ and ^in^ clauses, all varlist variables
    from the master data, and all varlist variables from the using data that
    contain mismatches.

    ^keep(min)^ retains the smallest relevant portion of the merged data: only
    those observations with at least one mismatch, only those varlist
    variables from the master and using data that contain mismatches, with
    all pairs of data values set to missing or blank when they agree.

    Both forms of ^keep^ create a variable ^_compdta^ that records the number
    of mismatches in each observation.
    ^keep(min)^ also creates a variable ^_id^ that records the original
    observation number in the master data.

^list^ displays mismatched pairs of data values, along with their observation
    number, in side-by-side fashion. By default, only a count of mismatched
    observations is displayed.

^noisily^ expands the information reported, displays a warning when the master
    and using data disagree in sort order, and shows the minimum values of
    the ^maxvar^ and ^width^ parameters. The ^compress^ command is always
    applied to the master and using data prior to merging them; ^noisily^
    also displays the effects of the compression.

^sort^ attempts to sort the using data to agree with the sort order of the
    master data, if the master data are sorted. By default, the comparison is
    performed using the current ordering of both data sets.


Examples
--------

 . ^compdta using backup^
    (compare the data set in memory with backup.dta; terse or no output)

 . ^compdta pr* using site1, l s^
    (compare all variables whose names begin with "pr" with their
    counterparts in site1.dta. List discrepant pairs of values in
    side-by-side fashion. Sort site1.dta to agree with the data in memory
    before comparing.)

 . ^compdta y1-y3 rate if time <= 10 using pilot, noi keep(max)^
    (compare variables in the list "y1-y3 rate" with the same list of
    variables in pilot.dta, ignoring all observations where time > 10.
    Retain the comparison data set, and display notes about its
    construction.)


Author
------

        John R. Gleason
        Syracuse University
        73241.717@@compuserve.com


Also see
--------

    STB:  STB-28 dm36
 Manual:  [4] memory, [5d] compress
On-line:  help for @merge@, @compress@, @memsize@