Statalist



st: AW: AW: unstable results with repeating the nearmrg command


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: AW: unstable results with repeating the nearmrg command
Date   Tue, 11 Aug 2009 13:58:41 +0200



Also see http://www.stata-journal.com/sjpdf.html?articlenum=dm0019



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Martin Weiss
Sent: Tuesday, 11 August 2009 13:58
To: [email protected]
Subject: st: AW: unstable results with repeating the nearmrg command



Could this possibly have anything to do with the -stable- option for -sort-?
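
To illustrate what -stable- changes, here is a minimal sketch, assuming the ageinfo2 file from the original post is in the working directory. By default -sort- orders observations with equal key values arbitrarily, so their relative order can differ between runs; -stable- keeps tied observations in their existing order. Whether -nearmrg- sorts internally on a key with ties is an assumption on my part, not something I have verified in its code.

```stata
* Sketch only: check for ties on the sort keys, then sort reproducibly.
use ageinfo2, clear

* -isid- exits with an error if gender and age together do not
* uniquely identify the observations (i.e., if there are ties).
isid gender age

* With -stable-, any tied observations keep their current relative order.
sort gender age, stable

* Sorting on gender alone does produce ties in this dataset, so without
* -stable- the within-gender order may change from one run to the next.
sort gender, stable
list
```

If -isid- passes, the master data has no ties on gender and age, so any remaining run-to-run variation would have to come from a sort on fewer keys somewhere else.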



HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of John Hund
Sent: Monday, 10 August 2009 23:02
To: [email protected]
Subject: st: unstable results with repeating the nearmrg command

I am having a very perplexing problem with the nearmrg command...it  
seems to give different results on subsequent runs with the same  
data.  In addition, my co-author and I get different results on the
same datasets, similarly sorted.  An example of the problem is
below, using a very small (5 observation) dataset.  The two datasets  
are ageinfo1 and ageinfo2:

ageinfo1
      +----------------------------+
      | id   gender   age   income |
      |----------------------------|
   1. |  4        1    12       56 |
   2. |  3        1    25       21 |
   3. |  1        1    34       23 |
   4. |  5        2    18       75 |
   5. |  2        2    40       43 |
      +----------------------------+
Note that ageinfo1 is sorted by gender and age, and doesn't contain  
any duplicate values.

ageinfo2
      +-----------------------------+
      |  id   gender   income   age |
      |-----------------------------|
   1. | 415        1       12    12 |
   2. | 314        1       32    25 |
   3. | 516        2       65    18 |
   4. | 213        2       32    40 |
   5. |  12        2       12    34 |
      +-----------------------------+
ageinfo2 does not need to be sorted, but I sort it below to
facilitate replication. Issuing the following commands in order
gives:

. use ageinfo2

. sort gender age

. nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)

. list

      +-----------------------------------------------+
      |  id   gender   income   age   _merge   newage |
      |-----------------------------------------------|
   1. | 415        1       12    12        3       12 |
   2. | 314        1       32    25        3       25 |
   3. | 516        2       65    18        3       18 |
   4. |  12        2       12    34        3       18 |
   5. | 213        2       32    40        3       40 |
      +-----------------------------------------------+

. clear

. use ageinfo2

. sort gender age

. nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)

. list

      +-----------------------------------------------+
      |  id   gender   income   age   _merge   newage |
      |-----------------------------------------------|
   1. | 415        1       12    12        3       12 |
   2. | 314        1       32    25        3       12 |
   3. | 516        2       65    18        3       18 |
   4. |  12        2       12    34        3       18 |
   5. | 213        2       32    40        3       40 |
      +-----------------------------------------------+

. clear

. use ageinfo2

. sort gender age

. nearmrg gender using ageinfo1, nearvar(age) lower genmatch(newage)

. list

      +-----------------------------------------------+
      |  id   gender   income   age   _merge   newage |
      |-----------------------------------------------|
   1. | 314        1       32    25        3       25 |
   2. | 415        1       12    12        1        . |
   3. | 516        2       65    18        3       18 |
   4. |  12        2       12    34        3       18 |
   5. | 213        2       32    40        3       40 |
      +-----------------------------------------------+

The first outcome is correct, but subsequent runs give different (and  
incorrect) answers. My only guess at this point is that there is  
something going on with a temporary file which is not being cleared,  
but I don't know how that could happen. Has anyone else noticed a  
problem with this?

Thanks in advance,
John
=================================
John Hund
Visiting Assistant Professor
Jones Graduate School of Business
Rice University
Houston, TX 77005

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



