
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



st: Balance for PSM


From   Carlos Tendilla González <carlos.tendilla@imss.gob.mx>
To   <statalist@hsphsun2.harvard.edu>
Subject   st: Balance for PSM
Date   Mon, 2 Dec 2013 14:27:10 -0600

Hi,

I am using Stata 13 for a study of informality and its effect on wages. The dataset contains information about employees and their work status, along with some personal characteristics (age, sex, state, marital status, and others).

I have to perform propensity score matching (PSM) using nearest-neighbor (NN), stratification, radius, and kernel matching. I started with a propensity score match using psmatch2.ado. The commands I ran and the resulting balance table were as follows (the full log is also in the attachment):

. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, raw t(totalformal)
. probit totalformal familiar casado hombre edad edad2 escolaridad escolar2 edadsexo
. predict double ps
. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate
. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, both

------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test
    Variable      Matched | Treated Control    %bias  |bias| |    t    p>|t|
--------------------------+----------------------------------+----------------
     familiar   Unmatched | .47932   .29533     38.5         |  59.46  0.000
                 Matched  | .47932   .48352     -0.9    97.7 | -61.65  0.000
                          |                                  |
       casado   Unmatched |   .545   .37322     35.0         |  54.35  0.000
                 Matched  |   .545   .54642     -0.3    99.2 | -55.63  0.000
                          |                                  |
       hombre   Unmatched |  .6161   .62242     -1.3         |  -2.03  0.043
                 Matched  |  .6161   .62591     -2.0   -55.0 |  -0.86  0.390
                          |                                  |
         edad   Unmatched | 35.085   31.907     26.9         |  42.43  0.000
                 Matched  | 35.085   34.781      2.6    90.4 | -38.79  0.000
                          |                                  |
        edad2   Unmatched | 1348.2   1179.4     19.4         |  30.44  0.000
                 Matched  | 1348.2   1322.7      2.9    84.9 | -27.09  0.000
                          |                                  |
  escolaridad   Unmatched | 11.209   7.9337     57.0         |  88.51  0.000
                 Matched  | 11.209   11.156      0.9    98.4 | -95.41  0.000
                          |                                  |
     escolar2   Unmatched | 159.96   94.585     14.8         |  22.94  0.000
                 Matched  | 159.96   156.09      0.9    94.1 | -27.02  0.000
                          |                                  |
     edadsexo   Unmatched |  21.85   19.616     11.9         |  18.39  0.000
                 Matched  |  21.85   22.111     -1.4    88.3 | -18.50  0.000
                          |                                  |
------------------------------------------------------------------------------
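For reference, the %bias column can be reproduced by hand. The sketch below assumes the definition given in the pstest help file: the difference in means as a percentage of the square root of the average of the treated and control variances in the unmatched sample. It also assumes that the variance of a 0/1 covariate is approximately p*(1-p). The scalar names are hypothetical; the means are copied from the familiar rows of the table.

```stata
* Standardized % bias for the binary covariate "familiar",
* using the means reported by pstest (scalar names are illustrative).
scalar pT  = .47932     // mean among treated
scalar pC  = .29533     // mean among unmatched controls
scalar pCm = .48352     // mean among matched controls

* Denominator: square root of the average of the two unmatched variances
scalar sd = sqrt((pT*(1 - pT) + pC*(1 - pC))/2)

display 100*(pT - pC)/sd     // about 38.5, the unmatched %bias
display 100*(pT - pCm)/sd    // about -0.9, the matched %bias
```

Because the denominator is fixed at the unmatched variances, the matched %bias shrinks only when the matched means move closer together.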

I thought the results were OK, since the absolute bias after matching is below 5% in all cases. I then tried radius matching, following the same steps as before but adding the radius option to the command:

. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate radius

The problem is that Stata had still not finished processing the command after 24 hours. I then tried pscore.ado instead, and Stata reported that the sample does not satisfy the balancing property, so I would have to respecify the model to achieve balance.
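One way to check whether the radius run is merely slow is to try it on a random subsample first. The sketch below is only an illustration under stated assumptions: the 10% sampling fraction, the seed, and the caliper of 0.01 are arbitrary choices, not recommendations. It relies on the psmatch2 help file's description of radius as matching within the radius given by caliper().

```stata
* Trial run of radius matching on a 10% random subsample.
* The seed, sampling fraction, and caliper are illustrative assumptions.
set seed 12345
preserve
sample 10                         // keep a 10% random sample of observations
psmatch2 totalformal, outcome(lsalhora) pscore(ps) ate radius caliper(0.01)
pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, both
restore                           // return to the full data set
```

If the subsample run finishes in reasonable time, the run time on all 99,223 observations can be extrapolated before committing to it.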

In conclusion, I have four questions:

1) Were the first results I got with psmatch2.ado wrong (unbalanced)?
2) If the answer is no, do I need a more powerful PC to run radius matching with psmatch2.ado?
3) If the answer is yes, why did psmatch2.ado work without radius but not with radius?
4) Is it possible that my sample is not suitable for PSM?

Thanks and regards.






. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, raw t(totalformal)

---------------------------------------------------------
             |       Mean               |     t-test
    Variable | Treated Control    %bias |    t    p>|t|
-------------+--------------------------+----------------
familiar     | .47932   .29533     38.5 |  59.46  0.000
casado       |   .545   .37322     35.0 |  54.35  0.000
hombre       |  .6161   .62242     -1.3 |  -2.03  0.043
edad         | 35.085   31.907     26.9 |  42.43  0.000
edad2        | 1348.2   1179.4     19.4 |  30.44  0.000
escolaridad  | 11.209   7.9337     57.0 |  88.51  0.000
escolar2     | 159.96   94.585     14.8 |  22.94  0.000
edadsexo     |  21.85   19.616     11.9 |  18.39  0.000
---------------------------------------------------------
-------------------------------------------------------------

            Summary of the distribution of |bias|
-------------------------------------------------------------
      Percentiles      Smallest
 1%     1.303065       1.303065
 5%     1.303065       11.86045
10%     1.303065       14.83451       Obs                   8
25%     13.34748        19.3693       Sum of Wgt.           8

50%      23.1483                      Mean           25.59916
                        Largest       Std. Dev.      17.63892
75%     36.72786       26.92731
90%     57.04293       34.99445       Variance       311.1315
95%     57.04293       38.46126       Skewness       .4347822
99%     57.04293       57.04293       Kurtosis       2.380985
-------------------------------------------------------------

----------------------------------------------------------
Pseudo R2      LR chi2        p>chi2      MeanB     MedB
----------------------------------------------------------
    0.162     21888.56         0.000      25.6      23.1
----------------------------------------------------------

. probit totalformal familiar casado hombre edad edad2 escolaridad escolar2 edadsexo

Iteration 0:   log likelihood = -67614.237  
Iteration 1:   log likelihood = -56703.103  
Iteration 2:   log likelihood = -56669.967  
Iteration 3:   log likelihood = -56669.958  
Iteration 4:   log likelihood = -56669.958  

Probit regression                                 Number of obs   =      99223
                                                  LR chi2(8)      =   21888.56
                                                  Prob > chi2     =     0.0000
Log likelihood = -56669.958                       Pseudo R2       =     0.1619

------------------------------------------------------------------------------
 totalformal |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    familiar |   .4597621   .0090672    50.71   0.000     .4419907    .4775336
      casado |   .1712503   .0099076    17.28   0.000     .1518318    .1906689
      hombre |  -.0904555   .0273549    -3.31   0.001    -.1440701    -.036841
        edad |   .0989216   .0023001    43.01   0.000     .0944135    .1034298
       edad2 |   -.001151   .0000305   -37.75   0.000    -.0012108   -.0010913
 escolaridad |   .1330989   .0013402    99.31   0.000     .1304721    .1357257
    escolar2 |  -.0012331   .0000168   -73.45   0.000     -.001266   -.0012002
    edadsexo |   .0063368   .0007784     8.14   0.000     .0048112    .0078624
       _cons |  -3.125473   .0427725   -73.07   0.000    -3.209305    -3.04164
------------------------------------------------------------------------------

. predict double ps
(option pr assumed; Pr(totalformal))

. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
        lsalhora  Unmatched | 3.13651377   2.62900682   .507506949   .004243537   119.60
                        ATT | 3.13651377   2.88727297   .249240806   .020282834    12.29
                        ATU | 2.62900682   2.81696837    .18796155            .        .
                        ATE |                           .223280976            .        .
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.

           | psmatch2:
 psmatch2: |   Common
 Treatment |  support
assignment | On suppor |     Total
-----------+-----------+----------
 Untreated |    42,034 |    42,034 
   Treated |    57,189 |    57,189 
-----------+-----------+----------
     Total |    99,223 |    99,223 
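The sort-order warning printed by psmatch2 above can be addressed by putting the data in a random but reproducible order before matching. A minimal sketch, assuming a helper variable named u (hypothetical) and an arbitrary seed:

```stata
* Randomize the sort order reproducibly before calling psmatch2.
set seed 20131202                 // arbitrary seed, fixed only for reproducibility
generate double u = runiform()    // u is a hypothetical helper variable
sort u
psmatch2 totalformal, outcome(lsalhora) pscore(ps) ate
```

With many observations sharing the same propensity score, fixing the seed makes the choice among tied matches repeatable across runs.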


. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, both

------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test
    Variable      Matched | Treated Control    %bias  |bias| |    t    p>|t|
--------------------------+----------------------------------+----------------
     familiar   Unmatched | .47932   .29533     38.5         |  59.46  0.000
                 Matched  | .47932   .48352     -0.9    97.7 | -61.65  0.000
                          |                                  |
       casado   Unmatched |   .545   .37322     35.0         |  54.35  0.000
                 Matched  |   .545   .54642     -0.3    99.2 | -55.63  0.000
                          |                                  |
       hombre   Unmatched |  .6161   .62242     -1.3         |  -2.03  0.043
                 Matched  |  .6161   .62591     -2.0   -55.0 |  -0.86  0.390
                          |                                  |
         edad   Unmatched | 35.085   31.907     26.9         |  42.43  0.000
                 Matched  | 35.085   34.781      2.6    90.4 | -38.79  0.000
                          |                                  |
        edad2   Unmatched | 1348.2   1179.4     19.4         |  30.44  0.000
                 Matched  | 1348.2   1322.7      2.9    84.9 | -27.09  0.000
                          |                                  |
  escolaridad   Unmatched | 11.209   7.9337     57.0         |  88.51  0.000
                 Matched  | 11.209   11.156      0.9    98.4 | -95.41  0.000
                          |                                  |
     escolar2   Unmatched | 159.96   94.585     14.8         |  22.94  0.000
                 Matched  | 159.96   156.09      0.9    94.1 | -27.02  0.000
                          |                                  |
     edadsexo   Unmatched |  21.85   19.616     11.9         |  18.39  0.000
                 Matched  |  21.85   22.111     -1.4    88.3 | -18.50  0.000
                          |                                  |
------------------------------------------------------------------------------
-------------------------------------------------------------
         Summary of the distribution of the abs(bias)
-------------------------------------------------------------

                       BEFORE MATCHING
-------------------------------------------------------------
      Percentiles      Smallest
 1%     1.303065       1.303065
 5%     1.303065       11.86045
10%     1.303065       14.83451       Obs                   8
25%     13.34748        19.3693       Sum of Wgt.           8

50%      23.1483                      Mean           25.59916
                        Largest       Std. Dev.      17.63892
75%     36.72786       26.92731
90%     57.04293       34.99445       Variance       311.1315
95%     57.04293       38.46126       Skewness       .4347822
99%     57.04293       57.04293       Kurtosis       2.380985
-------------------------------------------------------------

                       AFTER MATCHING
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .2885384       .2885384
 5%     .2885384       .8772566
10%     .2885384       .8781691       Obs                   8
25%     .8777128       .9266033       Sum of Wgt.           8

50%     1.155967                      Mean           1.484899
                        Largest       Std. Dev.         .9295
75%     2.297657        1.38533
90%     2.927983       2.020247       Variance       .8639703
95%     2.927983       2.575066       Skewness       .4030194
99%     2.927983       2.927983       Kurtosis       1.804256
-------------------------------------------------------------

-----------------------------------------------------------------
 Sample  | Pseudo R2    LR chi2    p>chi2    MeanBias    MedBias
---------+-------------------------------------------------------
 Raw     |    0.162    21888.56    0.000       25.6       23.1
 Matched |    0.162    21892.00    0.000        1.5        1.2
-----------------------------------------------------------------

