
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Replicability and -imputw-


From   Roberto Ferrer <[email protected]>
To   Stata Help <[email protected]>
Subject   Re: st: Replicability and -imputw-
Date   Sun, 25 Aug 2013 23:58:49 +0100

Thank you Robert. That was precisely it.

Before posting to the list, I had run into a similar problem earlier
in the code. The solution there was to deal with the instability of
-sort-. For this new problem I was aware of how -sort- works, but my
mistake was not following its implications through fully.

I thought about its effects on the regression and concluded that no
harm was done (correct me if I'm wrong). Likewise for the random-number
generating process. What I did not realize was that, in the program,
the same random numbers were being applied to differently ordered
observations, so the results differed from run to run. A subtlety,
for me at least.
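
A minimal sketch of the mechanism described above, on hypothetical toy data (the variables id, g, u1 and u2 are illustrative and not part of the original data or of -imputw-). It assumes a Stata version that provides runiform(); the original program uses the older uniform(). The seed fixes the sequence of draws by position in the data, but -sort- on a non-unique key may order tied observations differently each time, so a given draw can land on a different observation.

```stata
* Hypothetical toy data: g takes only two values, so -sort g- has ties.
clear
set obs 6
generate id = _n
generate g  = mod(_n, 2)

* Run 1: draw after sorting on the non-unique key.
set seed 391829
sort g                          // order within each value of g is arbitrary
generate double u1 = runiform()

* Run 2: same seed, same data, sorted again on the same key.
sort id                         // discard the previous sort order
set seed 391829
sort g
generate double u2 = runiform()

* Compare the draws observation by observation; they need not agree.
sort id
list id g u1 u2
```

Whether u1 and u2 actually differ in a particular run depends on how the ties happen to be broken, so no specific output is shown here.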

I now use -isid varlist, sort- in order to allow for replicability.
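
As a rough sketch of that fix, again on hypothetical toy data (id stands in for whatever variable or variables make observations unique in the real data): -isid- exits with an error unless the listed variables jointly identify the observations, and its -sort- option leaves the data sorted on them, so the order is unique and the same draw reaches the same observation on every run.

```stata
* Hypothetical toy data; g has ties, g and id together are unique.
clear
set obs 6
generate id = _n
generate g  = mod(_n, 2)

* Check uniqueness and sort on g id in one step.
isid g id, sort
set seed 391829
generate double u1 = runiform()

* Scramble the data, then restore the unique order and draw again.
generate double shuffle = runiform()
sort shuffle
isid g id, sort
set seed 391829
generate double u2 = runiform()

* Same seed and same unique order, so the draws match observation by observation.
assert u1 == u2
```

For the by-group call quoted below, the analogous step would be something like -isid yearobs size_b persnr, sort- before -by yearobs size_b: imputw ...-, where persnr is an assumed name for a person identifier and may not exist under that name in the data.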

Thanks again.

Bests,
Roberto

On Sun, Aug 25, 2013 at 10:40 PM, Robert Picard <[email protected]> wrote:
> My guess is that your data is not fully sorted at the time you run the
> program. See:
>
> http://www.stata.com/support/faqs/programming/sorting-on-categorical-variables/
>
>
>
> On Sun, Aug 25, 2013 at 3:02 PM, Roberto Ferrer <[email protected]> wrote:
>> Hello,
>>
>> I've been using a user-written command -imputw- downloaded from
>>
>> http://fdz.iab.de/187/section.aspx/Publikation/k050719a04
>> Based on Gartner, Hermann. "The Imputation of Wages Above the Contribution
>> Limit with the German IAB Employment Sample." FDZ, 2005.
>>
>> My problem is with replicability. I use -set seed- to control for the
>> randomness introduced by the command but I can't manage to obtain the
>> same results for the output variable -lnw_i-. Can anyone please point
>> to the source of "uncontrolled randomness" that is affecting the results
>> by inspecting the code?
>>
>> I've double checked, using -cf-, that the data going in is the same
>> for the replication runs. The results for the regressions are the same
>> for all runs (I've checked the log files in a bash terminal (linux)
>> using the program "diff" and they are identical except for log times).
>> But the final resulting variable is not the same for any two runs.
>>
>> I copy the source below, since it's not very long, along with the code
>> snippet I'm running.
>>
>> Thank you.
>>
>> * --------------------- User-written command -------------------------
>> program define imputw, byable(recall)
>>
>> version 8
>> syntax varlist [if] , Cens(varlist) Grenze(varlist) [Outvar(string asis)]
>>
>>     marksample touse
>> * If no name given to the output, call it by default "lnw_i".
>>     if "`outvar'" == "" {
>> local outvar "lnw_i"
>>     }
>> * Estimate Tobit model
>> cnreg `varlist' if `touse', censored(`cens')
>> quietly {
>> * Make predictions
>> predict xb00 if `touse'  , xb
>> * Generate standardized limit for each value
>> gen alpha00=(ln(`grenze')-xb00)/_b[_se] if `touse'
>>     }
>>
>> cap gen  `outvar'=.
>> replace `outvar'=`1' if `touse'
>> * Imputation
>> replace `outvar'=xb00+_b[_se] * ///
>>     invnorm(uniform()*(1-norm(alpha00))+norm(alpha00)) if `touse' & `cens'
>>
>> drop xb00 alpha00
>> end
>>
>> * ------------------- Code I'm using -----------------------------------
>> set seed 391829 // -imputw- uses random number generator
>> sort yearobs size_b
>> by yearobs size_b: imputw lwage frau gebjahr bild esector, cens(censored) ///
>> grenze(uplimit)
>> *
>> *   For searches and help try:
>> *   http://www.stata.com/help.cgi?search
>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>> *   http://www.ats.ucla.edu/stat/stata/

