Statalist



st: AW: Re: unstable results with repeating the nearmrg command


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: Re: unstable results with repeating the nearmrg command
Date   Tue, 11 Aug 2009 14:16:48 +0200



So you have to edit the .ado code to make this thing work properly? If so,
could you email the authors and let them know so they can update the file?



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of John Hund
Sent: Tuesday, 11 August 2009 14:12
To: [email protected]
Subject: st: Re: unstable results with repeating the nearmrg command

Thanks Martin...

It actually does have to do with the stable option.  Right after the  
first append in the .ado file, the appended data actually could have  
(and in this case does have) duplicates for any exact matches, so the  
sort command is ambiguous.  Changing the lines:

append using `work'
sort `fullvars'

to

append using `work'
sort `fullvars', stable

fixes the problem. I'd encourage anyone who uses this to make the  
change!
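
For anyone who wants to see the effect in isolation, here is a minimal sketch. The toy data and the variable name origorder are made up for illustration and are not taken from nearmrg.ado. It assumes only that some observations tie on the sort keys, which is the situation after the append.

```stata
* Hypothetical toy data: observations 1 and 2 tie on (gender, age)
clear
input id gender age
1 1 12
2 1 12
3 1 25
4 2 18
end

* Report how many observations share the same sort keys
duplicates report gender age

* With -stable-, tied observations keep their prior relative order,
* so the result is the same on every run
sort gender age, stable
list, sepby(gender)

* Alternative sketch: break ties explicitly with an order variable
generate long origorder = _n
sort gender age origorder
```

Without the stable option, Stata orders tied observations arbitrarily, which is why repeated runs on identical data can differ.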

Thanks again,
John
=================================
John Hund
Visiting Assistant Professor
Jones Graduate School of Business
Rice University
Houston, TX 77005



On Aug 10, 2009, at 4:02 PM, John Hund wrote:

>
> I am having a very perplexing problem with the nearmrg command...it  
> seems to give different results on subsequent runs with the same  
> data.  In addition, my co-author and I get different results on the  
> same datasets, similarly sorted.  An example of the problem is  
> below, using a very small (5 observation) dataset.  The two  
> datasets are ageinfo1 and ageinfo2:
>
> ageinfo1
>      +----------------------------+
>      | id   gender   age   income |
>      |----------------------------|
>   1. |  4        1    12       56 |
>   2. |  3        1    25       21 |
>   3. |  1        1    34       23 |
>   4. |  5        2    18       75 |
>   5. |  2        2    40       43 |
>      +----------------------------+
> Note that ageinfo1 is sorted by gender and age, and doesn't contain  
> any duplicate values.
>
> ageinfo2
>      +-----------------------------+
>      |  id   gender   income   age |
>      |-----------------------------|
>   1. | 415        1       12    12 |
>   2. | 314        1       32    25 |
>   3. | 516        2       65    18 |
>   4. | 213        2       32    40 |
>   5. |  12        2       12    34 |
>      +-----------------------------+
> ageinfo2 does not need to be sorted, but I sort it below to  
> facilitate replication. Issuing the following commands in  
> order gives:
>
> . use ageinfo2
>
> . sort gender age
>
> . nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)
>
> . list
>
>      +-----------------------------------------------+
>      |  id   gender   income   age   _merge   newage |
>      |-----------------------------------------------|
>   1. | 415        1       12    12        3       12 |
>   2. | 314        1       32    25        3       25 |
>   3. | 516        2       65    18        3       18 |
>   4. |  12        2       12    34        3       18 |
>   5. | 213        2       32    40        3       40 |
>      +-----------------------------------------------+
>
> . clear
>
> . use ageinfo2
>
> . sort gender age
>
> . nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)
>
> . list
>
>      +-----------------------------------------------+
>      |  id   gender   income   age   _merge   newage |
>      |-----------------------------------------------|
>   1. | 415        1       12    12        3       12 |
>   2. | 314        1       32    25        3       12 |
>   3. | 516        2       65    18        3       18 |
>   4. |  12        2       12    34        3       18 |
>   5. | 213        2       32    40        3       40 |
>      +-----------------------------------------------+
>
> . clear
>
> . use ageinfo2
>
> . sort gender age
>
> . nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)
>
> . list
>
>      +-----------------------------------------------+
>      |  id   gender   income   age   _merge   newage |
>      |-----------------------------------------------|
>   1. | 314        1       32    25        3       25 |
>   2. | 415        1       12    12        1        . |
>   3. | 516        2       65    18        3       18 |
>   4. |  12        2       12    34        3       18 |
>   5. | 213        2       32    40        3       40 |
>      +-----------------------------------------------+
>
> The first outcome is correct, but subsequent runs give different  
> (and incorrect) answers. My only guess at this point is that there  
> is something going on with a temporary file which is not being  
> cleared, but I don't know how that could happen. Has anyone else  
> noticed a problem with this?
>
> Thanks in advance,
> John
> =================================
> John Hund
> Visiting Assistant Professor
> Jones Graduate School of Business
> Rice University
> Houston, TX 77005

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

