



RE: st: Obtaining details about -merge, update-


From   Joe Canner <[email protected]>
To   "[email protected]" <[email protected]>
Subject   RE: st: Obtaining details about -merge, update-
Date   Fri, 18 Oct 2013 23:46:24 +0000

Before I left to go home today, I started on a simple brute-force approach to this problem: clone all of the variables in the master data set, then compare the clones after the merge with the new values in the merged data set. I need to work out a few details, but it seems to do more or less what I want. So, yes, it would require enough memory to duplicate the master data set.
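
The clone-and-compare idea can be sketched roughly as follows. This is a plain-Python illustration of the logic, not Stata code. The function names, the toy data, and the use of None as a stand-in for a missing value are all assumptions made for the example. It emulates only what -merge, update replace- does to overlapping variables for observations already in the master, and it ignores using-only observations.

```python
import copy

MISSING = None  # assumed stand-in for a Stata missing value


def merge_update_replace(master, using, key):
    """Rough emulation of -merge 1:1 key, update replace-, master rows only.

    Non-missing values from `using` overwrite the master's values for
    overlapping variables; missing values in `using` never overwrite.
    """
    using_by_key = {row[key]: row for row in using}
    merged = copy.deepcopy(master)  # leave the master (the "clone") untouched
    for row in merged:
        other = using_by_key.get(row[key])
        if other is None:
            continue  # observation not matched in the using data
        for var, new_value in other.items():
            if var != key and var in row and new_value is not MISSING:
                row[var] = new_value
    return merged


def variable_level_report(before, after, key):
    """Compare each variable before and after the merge, row by row."""
    report = {}
    for old, new in zip(before, after):
        for var in old:
            if var == key:
                continue
            counts = report.setdefault(
                var, {"updated": 0, "replaced": 0, "original": 0, "total": 0}
            )
            counts["total"] += 1
            if old[var] is MISSING and new[var] is not MISSING:
                counts["updated"] += 1    # missing value filled in
            elif old[var] != new[var]:
                counts["replaced"] += 1   # non-missing value overwritten
            else:
                counts["original"] += 1   # value left as it was


    return report


# Hypothetical toy data, keyed on "id".
master = [
    {"id": 1, "age": 30, "lastname": "Smith"},
    {"id": 2, "age": MISSING, "lastname": "Jones"},
    {"id": 3, "age": 41, "lastname": MISSING},
    {"id": 4, "age": 52, "lastname": "Lee"},
]
using = [
    {"id": 1, "age": 31, "lastname": "Smith"},
    {"id": 2, "age": 25, "lastname": MISSING},
    {"id": 3, "age": 41, "lastname": "Brown"},
    {"id": 5, "age": 60, "lastname": "Khan"},
]

merged = merge_update_replace(master, using, "id")
report = variable_level_report(master, merged, "id")

print(f"{'varname':<10}{'updated':>9}{'replaced':>10}{'original':>10}{'total':>7}")
for var, c in report.items():
    print(f"{var:<10}{c['updated']:>9}{c['replaced']:>10}{c['original']:>10}{c['total']:>7}")
```

One limitation of comparing against a clone of the master alone: with plain -merge, update- (no replace), a conflicting master value is left unchanged, so a per-variable count of conflicts would also need the using values kept alongside for comparison.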
________________________________________
From: [email protected] [[email protected]] on behalf of Sergiy Radyakin [[email protected]]
Sent: Friday, October 18, 2013 6:57 PM
To: [email protected]
Subject: Re: st: Obtaining details about -merge, update-

Phil, not exactly. The variable _merge tells me whether the whole
observation was matched, or came only from the 'original' or 'using' data.
The question is about variable-level updates. Something like:

varname     updated   replaced   original   total
--------------------------------------------------
age              12         20       5028    5060
lastname         18          2       5040    5060
....
100 more vars or so depending on the data
.....
--------------------------------------------------

Best, Sergiy Radyakin


On Fri, Oct 18, 2013 at 6:29 PM, Phil Clayton
<[email protected]> wrote:
> Perhaps I don't understand but isn't this what the _merge variable tells you?
>
> Phil
>
> On 19/10/2013, at 7:09, Joe Canner <[email protected]> wrote:
>
>> Dear Colleagues,
>>
>> Is there a relatively simple way to find out exactly what happened in the course of a -merge, update- command? In other words: I have two datasets with a number of overlapping variables and I want to find out how often, for each variable, a missing value in the master was updated with a non-missing value from the using dataset. Likewise, how often were values in the master not updated because of a non-missing conflict? Basically, this would be similar to the current merge results table, but on a variable-by-variable basis rather than based on the dataset as a whole.
>>
>> Of course, this same functionality would be useful for -merge, replace-, although that is not my present concern.
>>
>> If the answer is "no", is this something that people would be interested in?
>>
>> Regards,
>> Joe Canner
>> Johns Hopkins University School of Medicine
>>
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

