st: Balance for PSM
From: Carlos Tendilla González <[email protected]>
To: <[email protected]>
Subject: st: Balance for PSM
Date: Mon, 2 Dec 2013 14:27:10 -0600
I am using Stata 13. I am studying informality and its effect on wages. The dataset contains information about employees and their work status, along with some personal characteristics (age, sex, state, marital status, and others).
I have to perform propensity score matching (PSM) using nearest-neighbor (NN), stratification, radius, and kernel matching. I started with a propensity score match using psmatch2.ado, and these are the results I got (also available in the attachment):
. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, raw t(totalformal)
. probit totalformal familiar casado hombre edad edad2 escolaridad escolar2 edadsexo
. predict double ps
. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate
. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, both
             Unmatched |       Mean                 %reduct |   t-test
Variable       Matched |  Treated  Control   %bias   |bias| |        t   p>|t|
-----------------------+------------------------------------+-----------------
familiar     Unmatched |   .47932   .29533    38.5          |    59.46   0.000
               Matched |   .47932   .48352    -0.9     97.7 |   -61.65   0.000
                       |                                    |
casado       Unmatched |     .545   .37322    35.0          |    54.35   0.000
               Matched |     .545   .54642    -0.3     99.2 |   -55.63   0.000
                       |                                    |
hombre       Unmatched |    .6161   .62242    -1.3          |    -2.03   0.043
               Matched |    .6161   .62591    -2.0    -55.0 |    -0.86   0.390
                       |                                    |
edad         Unmatched |   35.085   31.907    26.9          |    42.43   0.000
               Matched |   35.085   34.781     2.6     90.4 |   -38.79   0.000
                       |                                    |
edad2        Unmatched |   1348.2   1179.4    19.4          |    30.44   0.000
               Matched |   1348.2   1322.7     2.9     84.9 |   -27.09   0.000
                       |                                    |
escolaridad  Unmatched |   11.209   7.9337    57.0          |    88.51   0.000
               Matched |   11.209   11.156     0.9     98.4 |   -95.41   0.000
                       |                                    |
escolar2     Unmatched |   159.96   94.585    14.8          |    22.94   0.000
               Matched |   159.96   156.09     0.9     94.1 |   -27.02   0.000
                       |                                    |
edadsexo     Unmatched |    21.85   19.616    11.9          |    18.39   0.000
               Matched |    21.85   22.111    -1.4     88.3 |   -18.50   0.000
-----------------------+------------------------------------+-----------------
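As a quick arithmetic check on the "%reduct |bias|" column, the reported figures appear consistent with 100 * (1 - |bias matched| / |bias unmatched|). The sketch below assumes that formula (it is inferred from the numbers, not quoted from the pstest documentation) and plugs in the unrounded |bias| values listed in the summaries of the attached log. The pairing of those values with hombre, edad and casado is itself inferred from the rounded table entries.

```stata
* Assumed formula: %reduct |bias| = 100 * (1 - |bias matched| / |bias unmatched|)

* hombre: unmatched |bias| 1.303065, matched |bias| 2.020247 (table reports -55.0)
display %6.1f 100 * (1 - 2.020247 / 1.303065)

* edad: unmatched |bias| 26.92731, matched |bias| 2.575066 (table reports 90.4)
display %6.1f 100 * (1 - 2.575066 / 26.92731)

* casado: unmatched |bias| 34.99445, matched |bias| .2885384 (table reports 99.2)
display %6.1f 100 * (1 - .2885384 / 34.99445)
```

Under this reading, a negative value such as the one for hombre means the absolute bias grew after matching (from 1.3 to 2.0), even though it remains below 5%.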
I thought the results were OK, since the absolute bias after matching is below 5% for every covariate. I then tried radius matching, following the same steps as before but adding the radius option to the command:
. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate radius
The problem is that Stata had still not finished processing the command after 24 hours. I then tried pscore.ado instead, and Stata reported that the sample does not satisfy the balance condition, so I would have to respecify the model to achieve balance.
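One way to gauge whether the radius run is feasible at all is to time it on a random subsample before committing to the full 99,223 observations. The sketch below assumes the dataset is in memory with ps already predicted from the probit and that psmatch2 is installed. The psmatch2 help describes radius matching in terms of caliper(), so the sketch supplies one; the 10% share, the seed and the caliper of 0.01 are arbitrary illustrative choices, not recommendations, and whether a caliper changes the run time here is not established.

```stata
* Sketch only: assumes the poster's data are in memory, ps has been predicted
* from the probit, and psmatch2 is installed.
preserve
set seed 12345
* keep a random 10% of observations (arbitrary share) to gauge run time
sample 10
timer clear 1
timer on 1
* caliper(0.01) is an arbitrary illustrative radius, not a recommendation
psmatch2 totalformal, outcome(lsalhora) pscore(ps) ate radius caliper(0.01) common
timer off 1
* elapsed seconds for the subsample run
timer list 1
restore
```

If the subsample finishes quickly, the share can be raised step by step to see how run time scales before the full sample is attempted.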
In conclusion, I have four questions:
1) Were the first results I got with psmatch2.ado wrong (unbalanced)?
2) If the answer is no, do I need a more powerful PC to run radius matching with psmatch2.ado?
3) If the answer is yes, why did psmatch2.ado work without radius but not with radius?
4) Is it possible that my sample is not suitable for PSM?
Thanks and regards.
. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, raw t(totalformal)
             |       Mean                |   t-test
Variable     |  Treated  Control   %bias |        t   p>|t|
-------------+---------------------------+-----------------
familiar     |   .47932   .29533    38.5 |    59.46   0.000
casado       |     .545   .37322    35.0 |    54.35   0.000
hombre       |    .6161   .62242    -1.3 |    -2.03   0.043
edad         |   35.085   31.907    26.9 |    42.43   0.000
edad2        |   1348.2   1179.4    19.4 |    30.44   0.000
escolaridad  |   11.209   7.9337    57.0 |    88.51   0.000
escolar2     |   159.96   94.585    14.8 |    22.94   0.000
edadsexo     |    21.85   19.616    11.9 |    18.39   0.000
-------------+---------------------------+-----------------
Summary of the distribution of |bias|
      Percentiles      Smallest
 1%     1.303065       1.303065
 5%     1.303065       11.86045
10%     1.303065       14.83451       Obs                   8
25%     13.34748        19.3693       Sum of Wgt.           8

50%      23.1483                      Mean           25.59916
                        Largest       Std. Dev.      17.63892
75%     36.72786       26.92731
90%     57.04293       34.99445       Variance       311.1315
95%     57.04293       38.46126       Skewness       .4347822
99%     57.04293       57.04293       Kurtosis       2.380985
 Pseudo R2    LR chi2   p>chi2   MeanB    MedB
     0.162   21888.56    0.000    25.6    23.1
. probit totalformal familiar casado hombre edad edad2 escolaridad escolar2 edadsexo
Iteration 0: log likelihood = -67614.237
Iteration 1: log likelihood = -56703.103
Iteration 2: log likelihood = -56669.967
Iteration 3: log likelihood = -56669.958
Iteration 4: log likelihood = -56669.958
Probit regression                                 Number of obs   =      99223
                                                  LR chi2(8)      =   21888.56
                                                  Prob > chi2     =     0.0000
Log likelihood = -56669.958                       Pseudo R2       =     0.1619

------------------------------------------------------------------------------
 totalformal |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    familiar |   .4597621   .0090672    50.71   0.000     .4419907    .4775336
      casado |   .1712503   .0099076    17.28   0.000     .1518318    .1906689
      hombre |  -.0904555   .0273549    -3.31   0.001    -.1440701    -.036841
        edad |   .0989216   .0023001    43.01   0.000     .0944135    .1034298
       edad2 |   -.001151   .0000305   -37.75   0.000    -.0012108   -.0010913
 escolaridad |   .1330989   .0013402    99.31   0.000     .1304721    .1357257
    escolar2 |  -.0012331   .0000168   -73.45   0.000     -.001266   -.0012002
    edadsexo |   .0063368   .0007784     8.14   0.000     .0048112    .0078624
       _cons |  -3.125473   .0427725   -73.07   0.000    -3.209305    -3.04164
------------------------------------------------------------------------------
. predict double ps
(option pr assumed; Pr(totalformal))
. psmatch2 totalformal, outcome (lsalhora) pscore(ps) ate
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
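        Variable     Sample |     Treated     Controls   Difference         S.E.   T-stat
----------------------------+------------------------------------------------------------
        lsalhora  Unmatched |  3.13651377   2.62900682   .507506949   .004243537   119.60
                        ATT |  3.13651377   2.88727297   .249240806   .020282834    12.29
                        ATU |  2.62900682   2.81696837    .18796155            .        .
                        ATE |                            .223280976            .        .
----------------------------+------------------------------------------------------------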
Note: S.E. does not take into account that the propensity score is estimated.
           | psmatch2:
 psmatch2: |   Common
 Treatment |  support
assignment | On suppor |     Total
-----------+-----------+----------
 Untreated |    42,034 |    42,034
   Treated |    57,189 |    57,189
-----------+-----------+----------
     Total |    99,223 |    99,223
. pstest familiar casado hombre edad edad2 escolaridad escolar2 edadsexo, both
             Unmatched |       Mean                 %reduct |   t-test
Variable       Matched |  Treated  Control   %bias   |bias| |        t   p>|t|
-----------------------+------------------------------------+-----------------
familiar     Unmatched |   .47932   .29533    38.5          |    59.46   0.000
               Matched |   .47932   .48352    -0.9     97.7 |   -61.65   0.000
                       |                                    |
casado       Unmatched |     .545   .37322    35.0          |    54.35   0.000
               Matched |     .545   .54642    -0.3     99.2 |   -55.63   0.000
                       |                                    |
hombre       Unmatched |    .6161   .62242    -1.3          |    -2.03   0.043
               Matched |    .6161   .62591    -2.0    -55.0 |    -0.86   0.390
                       |                                    |
edad         Unmatched |   35.085   31.907    26.9          |    42.43   0.000
               Matched |   35.085   34.781     2.6     90.4 |   -38.79   0.000
                       |                                    |
edad2        Unmatched |   1348.2   1179.4    19.4          |    30.44   0.000
               Matched |   1348.2   1322.7     2.9     84.9 |   -27.09   0.000
                       |                                    |
escolaridad  Unmatched |   11.209   7.9337    57.0          |    88.51   0.000
               Matched |   11.209   11.156     0.9     98.4 |   -95.41   0.000
                       |                                    |
escolar2     Unmatched |   159.96   94.585    14.8          |    22.94   0.000
               Matched |   159.96   156.09     0.9     94.1 |   -27.02   0.000
                       |                                    |
edadsexo     Unmatched |    21.85   19.616    11.9          |    18.39   0.000
               Matched |    21.85   22.111    -1.4     88.3 |   -18.50   0.000
-----------------------+------------------------------------+-----------------
Summary of the distribution of the abs(bias)
      Percentiles      Smallest
 1%     1.303065       1.303065
 5%     1.303065       11.86045
10%     1.303065       14.83451       Obs                   8
25%     13.34748        19.3693       Sum of Wgt.           8

50%      23.1483                      Mean           25.59916
                        Largest       Std. Dev.      17.63892
75%     36.72786       26.92731
90%     57.04293       34.99445       Variance       311.1315
95%     57.04293       38.46126       Skewness       .4347822
99%     57.04293       57.04293       Kurtosis       2.380985

      Percentiles      Smallest
 1%     .2885384       .2885384
 5%     .2885384       .8772566
10%     .2885384       .8781691       Obs                   8
25%     .8777128       .9266033       Sum of Wgt.           8

50%     1.155967                      Mean           1.484899
                        Largest       Std. Dev.         .9295
75%     2.297657        1.38533
90%     2.927983       2.020247       Variance       .8639703
95%     2.927983       2.575066       Skewness       .4030194
99%     2.927983       2.927983       Kurtosis       1.804256
  Sample | Pseudo R2    LR chi2   p>chi2   MeanBias   MedBias
---------+---------------------------------------------------
     Raw |     0.162   21888.56    0.000       25.6      23.1
 Matched |     0.162   21892.00    0.000        1.5       1.2
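
The psmatch2 warning in the log above asks for a random sort order because some observations share identical propensity scores. A minimal sketch of that step follows, assuming the dataset is in memory with totalformal, lsalhora and ps defined and that psmatch2 is installed; the seed value and the temporary variable are illustrative choices only.

```stata
* Sketch only: assumes totalformal, lsalhora and ps exist in the data in memory.

* report how many observations share the same propensity score
duplicates report ps

* arbitrary seed, set only so the random order can be reproduced
set seed 20131202

* one uniform random draw per observation, then sort on it
tempvar u
generate double `u' = runiform()
sort `u'

* rerun the match with ties no longer broken by the original file order
psmatch2 totalformal, outcome(lsalhora) pscore(ps) ate
```

Rerunning with a different seed gives a rough sense of how much the ATT, ATU and ATE estimates depend on how ties are broken.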