st: RE: Macro Producing Different Results Each Time Executed


From   "Sarah Edgington" <[email protected]>
To   <[email protected]>
Subject   st: RE: Macro Producing Different Results Each Time Executed
Date   Thu, 22 Aug 2013 13:55:39 -0700

You're right, this is very likely a sorting problem.
If PRVDR_NUM is not a unique identifier then it is entirely possible for the
sort to be different each time.  
You can use the stable option on sort.  The problem with that is that if the
sort order for the original files changes you may still get inconsistent
results.
I think the best strategy with issues like this is to figure out why
PRVDR_NUM isn't a unique identifier and then identify exactly what rule you
want to use to choose which record you want to keep for each PRVDR_NUM.
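As a rough sketch of how one might check whether PRVDR_NUM is unique and
look at the duplicated records, something like the following could be a
starting point.  The toy data and the variable names payment and dup are
made up for illustration; only PRVDR_NUM comes from the original code.

```stata
* Toy data (all values are made up): PRVDR_NUM is deliberately duplicated
clear
input str6 PRVDR_NUM payment
"A001" 10
"A001" 20
"B002" 30
"C003" 40
end

* isid exits with an error when PRVDR_NUM is not a unique identifier;
* capture traps the error so a do-file can keep running
capture isid PRVDR_NUM
display "isid return code: " _rc

* Count how many observations share a PRVDR_NUM
duplicates report PRVDR_NUM

* Flag the duplicated providers and list them to see how the records differ
* (dup is a hypothetical name for the flag variable)
duplicates tag PRVDR_NUM, generate(dup)
list if dup > 0, sepby(PRVDR_NUM)
```

Looking at the listed records side by side is usually the quickest way to
see why the identifier repeats and which record should be kept.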

One way to ensure that you always get the same results is to always sort by
a combination of variables that you know uniquely identifies observations.  Then you
will always identify the same first observation within a provider.  However,
while this should get you the SAME results every time you run the code, it
will not necessarily get you the RIGHT results.  If your results differ
each time you run this, that suggests that what you keep and what you drop
actually matters for your results.  So while PRVDR_NUM may be duplicated
across multiple observations, it sounds like the other variables you're
interested in actually vary across those observations.  You need to figure
out exactly which observations you want to retain and make sure your code
retains those observations to be sure that your results are both consistent
and correct.
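
A minimal sketch of that idea, assuming there is some second variable that
breaks ties within a provider.  Here report_year and payment are
hypothetical variables invented for the example, and "keep the most recent
report" is only one possible rule; the right rule depends on your data.

```stata
* Toy data (all values are made up)
clear
input str6 PRVDR_NUM report_year payment
"A001" 2011 10
"A001" 2012 20
"B002" 2012 30
"B002" 2010 40
end

* Confirm the combination uniquely identifies observations;
* isid stops with an error if it does not
isid PRVDR_NUM report_year

* Sort by provider and, within provider, by report_year, then keep the
* last (most recent) record.  Because the sort key is unique, the same
* record is kept on every run.
bysort PRVDR_NUM (report_year): keep if _n == _N
list
```

The point of the explicit rule is that the record you keep is chosen on
purpose rather than by whatever order the sort happens to produce.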

-Sarah


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of William Sankey
Sent: Thursday, August 22, 2013 1:22 PM
To: [email protected]
Subject: st: Macro Producing Different Results Each Time Executed

Dear Statalist,

The following 'foreach' statement produces different results each time it
executes. My suspicion is that the sort does not necessarily happen in the
same order each time the program runs, hence what is dropped and what is
kept in the merge becomes different each time.

Can the sort execute differently each time it is run, do you have other
thoughts on why I might be obtaining different results each time this
function is executed?

Thanks in advance,

foreach file in col1 col2  col4 col5   {

use `file' , clear

sort PRVDR_NUM
by PRVDR_NUM: gen keeper = 1 if _n==1
keep if keeper==1
drop keeper

sort PRVDR_NUM
save myusing3, replace

use mycostreports, clear

by PRVDR_NUM: generate unique=1 if _N==1 | (_N==2 &  psych_type=="Hosp")
keep if unique==1
drop unique

sort PRVDR_NUM
merge PRVDR_NUM using myusing3
tab _merge
drop if _merge==1
drop _merge

*Tables:

egen paytotal = total(ProviderPayments) if REG_G==9
egen paytotal_b = total(ProviderPay_base) if REG_G==9
gen change53 = (paytotal-paytotal_b)/paytotal_b if REG_G==9
drop paytotal*

*Executed for other changes*

sum change1 - change53
}

--
William J. Sankey
Johns Hopkins University
MA Public Policy '12
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/