
Re: st: RE: Loosen precision for ttest


From   Ryan Turner <[email protected]>
To   [email protected]
Subject   Re: st: RE: Loosen precision for ttest
Date   Fri, 6 Sep 2013 01:48:52 -0400

Joe,

Thanks for your message, and I apologize for the delayed response.  I suspect you are right.  I thought about perturbing my dummies {dummy ± uniformly distributed epsilon} to generate a pscore that varies continuously (but still has 6 strata if rounded).  I think this would work, but it feels like quite a hack; a rough sketch is below.
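
The sketch, with placeholder names (treatment t, dummies d1-d3) and an arbitrary epsilon of 1e-6:

    * jitter each dummy by a tiny uniform epsilon so the fitted pscore
    * varies continuously instead of collapsing onto a handful of values
    set seed 12345
    foreach v of varlist d1 d2 d3 {
        generate double `v'_j = `v' + 1e-6*(runiform() - 0.5)
    }

    * re-estimate the propensity score on the jittered dummies
    pscore t d1_j d2_j d3_j, pscore(ps_j) blockid(block_j) comsup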

1. Certainly; I am in the pscore specification search phase and often cannot get more robust specifications to balance.  I'd like to explore the full space, and it is frustrating that even this basic case will not run.

2. I tried -psmatch2- and I think it will work; it does not get hung up on this silly precision issue (a rough call is sketched after this list).  Thanks for the suggestion.

3. This did not work.  I think it still comes down to running this ttest on a swath of identical pscores, which, as noted, are declared statistically different purely because of floating-point error (a rounding check is also sketched below).
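
In case it helps anyone searching the archives later, the -psmatch2- call I have been trying looks roughly like this (outcome y, treatment t, and dummies d1-d3 are placeholders; -psmatch2- and -pstest- are from SSC):

    ssc install psmatch2

    * logit pscore on the dummies; one-to-one nearest-neighbour
    * matching with replacement, restricted to common support
    psmatch2 t d1 d2 d3, outcome(y) logit neighbor(1) common

    * balance check on the matched sample
    pstest d1 d2 d3, both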
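
And the sort of check and 'loosened' comparison I had in mind: here ps and block stand for the variables created through -pscore-'s pscore() and blockid() options, t is the treatment dummy, and the 1e-8 rounding unit is arbitrary.

    * confirm the within-block spread is pure floating-point noise
    summarize ps if block == 80
    display %21x r(min)
    display %21x r(max)

    * coarsen the score so values that should be identical compare equal
    generate double ps_r = round(ps, 1e-8)
    ttest ps_r if block == 80, by(t)

Since -pscore- runs this test internally, though, loosening the precision this way would mean redoing the balance check by hand (or editing pscore.ado).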

Thanks again,
Ryan

On Sep 3, 2013, at 8:54 PM, Joe Canner <[email protected]> wrote:

> Ryan,
> 
> Perhaps someone else can explain why Stata is doing this and whether it can be fixed (I suspect not), but in the meantime I would suggest one of the following:
> 
> 1. Reconsider whether you should be using propensity scores at all if you have many strata with the same score.  Some other matching method (e.g., coarsened exact matching using -cem-, sketched after this list) might work better and be more appropriate.  Without knowing the number and nature of the variables you are matching on, it is difficult to give more specific advice.
> 
> 2. Consider using -psmatch2- instead of -pscore-.  -psmatch2- uses similar methods to generate the score, but different methods to check for balance.  It also does the matching as part of the main run rather than delegating it to another program as -pscore- does.
> 
> 3. If -pscore- is your only option, you might be able to trick it into behaving by using the -numblo()- option to specify a different starting number of blocks.
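> 
> Untested sketches of 1 and 3, with a placeholder treatment variable t and covariates x1 x2:
> 
>     ssc install cem
>     * coarsened exact matching directly on the covariates; no propensity score needed
>     cem x1 x2, treatment(t)
> 
>     * same -pscore- call, but starting from a different number of blocks
>     pscore t x1 x2, pscore(myscore) blockid(myblock) numblo(10)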
> 
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Ryan Turner [[email protected]]
> Sent: Tuesday, September 03, 2013 7:21 PM
> To: [email protected]
> Subject: st: Loosen precision for ttest
> 
> Dear Statalist:
> 
> I have a precision question; I know much has been written about the topic (http://blog.stata.com/tag/precision/), but I have not yet been able to resolve this issue.
> 
> I am using -pscore- to perform propensity score matching, and in the first step it checks, using -ttest-, that the estimated propensity score is equal across the treatment and non-treatment groups.  My propensity score specification is based on dummies, which produces several swaths of observations with the same estimated pscore.  -pscore- should therefore settle on as many blocks as there are unique pscores ('swaths').  Instead, because of the ttest below, it keeps splitting the last block and retesting the same observations, ad infinitum.
> 
> In words, the algorithm determines the pscore to be non-constant in the last block, where 'non-constant' means 'differs by floating-point error'.  Therefore, I believe I need to 'loosen' the precision in this ttest.
> 
> Any thoughts are appreciated.  This issue only appears under certain specifications (which, unfortunately, are the ones I believe I want), and not in all blocks with a constant pscore.
> 
> Best,
> Ryan
> 
> 
> ******************************************************
> Step 1: Identification of the optimal number of blocks
> Use option detail if you want more detailed output
> ******************************************************
> 
> (79 blocks, mostly empty, omitted)
> 
> Test in block 80
> 
> Observations in block 80
> obs: 16453,  control: 407,  treated: 1676
> 
> 
> Test for block 80
> 
> Two-sample t test with equal variances
> ------------------------------------------------------------------------------
>   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
> ---------+--------------------------------------------------------------------
>       0 |     407    .8046087    5.51e-18    1.11e-16    .8046087    .8046087
>       1 |    1676    .8046087           0           0    .8046087    .8046087
> ---------+--------------------------------------------------------------------
> combined |    2083    .8046087    1.52e-18    6.94e-17    .8046087    .8046087
> ---------+--------------------------------------------------------------------
>    diff |           -1.11e-16    2.71e-18               -1.16e-16   -1.06e-16
> ------------------------------------------------------------------------------
>    diff = mean(0) - mean(1)                                      t = -40.9193
> Ho: diff = 0                                     degrees of freedom =     2081
> 
>    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
> Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
> 
> 
> 
> The mean propensity score is different for
> treated and controls in block 80
> Split the block 80 and retest
> 
> 
> 
> 
> 
> For completeness, I include a successful ttest from a different specification, where the mean is a rather boring 0 and the p-values for rejecting the null are all missing (.), so it obviously does not suffer from the floating-point precision issue.
> 
> Two-sample t test with equal variances
> ------------------------------------------------------------------------------
>   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
> ---------+--------------------------------------------------------------------
>       0 |     444           0           0           0           0           0
>       1 |    5829           0           0           0           0           0
> ---------+--------------------------------------------------------------------
> combined |    6273           0           0           0           0           0
> ---------+--------------------------------------------------------------------
>    diff |                   0           0                       0           0
> ------------------------------------------------------------------------------
>    diff = mean(0) - mean(1)                                      t =        .
> Ho: diff = 0                                     degrees of freedom =     6271
> 
>    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
> Pr(T < t) =      .         Pr(|T| > |t|) =      .          Pr(T > t) =      .
> 
> 
> 
> 
> --
> Ryan J. Turner
> Engineering and Public Policy
> Carnegie Mellon University

--
Ryan J. Turner <[email protected]>
Engineering and Public Policy
Carnegie Mellon University
+1-412-304-5014 (C) | +1-484-483-3244 (GV)


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

