Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


Re: st: Obtaining details about -merge, update-


From   Robert Picard <picard@netbox.com>
To   "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject   Re: st: Obtaining details about -merge, update-
Date   Sat, 19 Oct 2013 11:30:42 +0200

I think that -cf- can be used to find out how many times each variable
has been updated.

* -------------- begin example ------------------
sysuse auto, clear
isid make, sort
tempfile new
save "`new'"

replace price = . if price > 5000
replace rep78 = . if rep78 < 3
tempfile old
save "`old'"

merge 1:1 make using "`new'", update

* -cf- exits with error r(9) when it finds differences; if you do not
* want that to stop execution, use -capture noisily-...
cap noi cf _all using "`old'"

* if you don't mind the error...
cf _all using "`old'"
* -------------- end example --------------------
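
If the goal is one count per variable rather than the -cf- log, a variation on the same setup might look like the sketch below. It keeps a copy of each master variable before the merge and then counts how many missing values were filled in from the using data. The old_ prefix and the local macro vars are hypothetical names chosen for this illustration, and only the two variables altered in the example are checked.

```stata
* sketch only: old_ prefix and local macro vars are illustrative names
sysuse auto, clear
isid make, sort
tempfile new
save "`new'"

replace price = . if price > 5000
replace rep78 = . if rep78 < 3

* keep a copy of each master variable of interest before merging
local vars price rep78
foreach v of local vars {
    clonevar old_`v' = `v'
}

merge 1:1 make using "`new'", update

* per variable: missing in the master, non-missing after the update
foreach v of local vars {
    quietly count if missing(old_`v') & !missing(`v')
    display as text "`v'" _col(15) "updated: " as result r(N)
}
```

This costs one extra copy of each variable being tracked, so it trades memory for a single merge.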

On Sat, Oct 19, 2013 at 2:50 AM, Phil Clayton
<philclayton@internode.on.net> wrote:
> Oh sorry, I see now.
>
> Another option, which would take more time but less memory than Joe's brute force approach, is to merge in the variables one by one (with the keepusing() option) and, after each merge, rename _merge to varname_merge. This only requires one byte per observation for each variable merged, but for large datasets it might be too slow to be practical.
>
> Phil
>
> ------------------------------------
> * create a fake auto dataset with some new info for foreign & mpg
> sysuse auto, clear
> replace foreign = rbinomial(1, 0.5)
> replace mpg = rnormal(21, 6)
> keep make foreign mpg
> tempfile using
> save "`using'"
>
> * load the original auto dataset and merge in the new foreign & mpg variables
> sysuse auto, clear
> foreach var of varlist foreign mpg {
>         merge 1:1 make using "`using'", keepusing(`var') replace update
>         rename _merge `var'_merge
> }
> tab foreign_merge
> tab mpg_merge
> ------------------------------------
>
> On 19/10/2013, at 9:57 AM, Sergiy Radyakin <serjradyakin@gmail.com> wrote:
>
>> Phil, not exactly. The _merge variable tells me whether the whole
>> observation was matched or came only from the 'original' or 'using' data.
>> The question is about variable-level updates. Something like:
>>
>> varname     updated   replaced   original   total
>> -------------------------------------------------
>> age              12         20       5028    5060
>> lastname         18          2       5040    5060
>> ....
>> 100 more vars or so depending on the data
>> .....
>> -------------------------------------------------
>>
>> Best, Sergiy Radyakin
>>
>>
>> On Fri, Oct 18, 2013 at 6:29 PM, Phil Clayton
>> <philclayton@internode.on.net> wrote:
>>> Perhaps I don't understand but isn't this what the _merge variable tells you?
>>>
>>> Phil
>>>
>>> On 19/10/2013, at 7:09, Joe Canner <jcanner1@jhmi.edu> wrote:
>>>
>>>> Dear Colleagues,
>>>>
>>>> Is there a relatively simple way to find out exactly what happened in the course of a -merge, update- command?  In other words: I have two datasets with a number of overlapping variables and I want to find out how often, for each variable, a missing observation in the master was updated with a non-missing observation in the using dataset.  Likewise, how often were observations in the master not updated because of a non-missing conflict?  Basically, this would be similar to the current merge results table, but on a variable-by-variable basis rather than based on the dataset as a whole.
>>>>
>>>> Of course, this same functionality would be useful for -merge, replace-, although that is not my present concern.
>>>>
>>>> If the answer is "no", is this something that people would be interested in?
>>>>
>>>> Regards,
>>>> Joe Canner
>>>> Johns Hopkins University School of Medicine
>>>>
>>>> *
>>>> *   For searches and help try:
>>>> *   http://www.stata.com/help.cgi?search
>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> *   http://www.ats.ucla.edu/stat/stata/


© Copyright 1996–2018 StataCorp LLC