Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
re: Re: st: Prop.score matching: assess significance t-value + slow kernel matching
From: "Ariel Linden, DrPH" <[email protected]>
To: <[email protected]>
Subject: re: Re: st: Prop.score matching: assess significance t-value + slow kernel matching
Date: Sat, 11 Aug 2012 11:55:50 -0400
Hi Durk,
I ran the following code on a data set I had available with 11,527 treated
and 55,941 untreated:
. psmatch2 treatment gender-pre_unk, outcome(diff_all_admit_crnt) logit kernel kerneltype(normal) common
The code took about 20 minutes to run but provided the proper output.
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
diff_all_admit~t  Unmatched | .015008242  -.023006382   .038014623   .005563389     6.83
                        ATT | .015008242  -.016855461   .031863702   .006165275     5.17
----------------------------+-----------------------------------------------------------
So I would say that you may not have sufficient memory to run your analysis,
or that you're not giving it enough time...
If that is not the problem, you may want to contact the author of -psmatch2-
(Edwin Leuven) directly for advice...
Ariel
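
For readers wondering why kernel matching takes so long on data of this size: every treated unit is compared with every control unit, so the work grows with the product of the two group sizes. The Python sketch below illustrates that idea with a Gaussian kernel. It is not the code inside -psmatch2- or -attk-; the function names, the toy data, and the bandwidth of 0.06 are assumptions made for illustration only.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel weight."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_att(treated, controls, bandwidth):
    """Illustrative kernel-matching ATT (hypothetical helper, not psmatch2).

    treated, controls: lists of (propensity_score, outcome) pairs.
    Returns the ATT estimate and the number of kernel evaluations.
    """
    differences = []
    evaluations = 0
    for p_t, y_t in treated:
        weighted_sum = 0.0
        weight_total = 0.0
        # Each treated unit is weighed against every control unit.
        for p_c, y_c in controls:
            w = gaussian_kernel((p_t - p_c) / bandwidth)
            evaluations += 1
            weighted_sum += w * y_c
            weight_total += w
        if weight_total > 0.0:
            # Counterfactual outcome = kernel-weighted mean of controls.
            differences.append(y_t - weighted_sum / weight_total)
    if not differences:
        raise ValueError("no treated unit received positive kernel weight")
    return sum(differences) / len(differences), evaluations


if __name__ == "__main__":
    # Toy data, invented for illustration.
    toy_treated = [(0.50, 1.0), (0.70, 0.0)]
    toy_controls = [(0.50, 0.0), (0.50, 0.5), (0.70, 1.0)]
    att, n_eval = kernel_att(toy_treated, toy_controls, bandwidth=0.06)
    print("ATT:", att, "kernel evaluations:", n_eval)

    # Treated x control pairs for the group sizes reported in this thread.
    print("Durk's data :", 36874 * 17578)
    print("Ariel's data:", 11527 * 55941)
```

With the group sizes reported in this thread, that is roughly 648 million pairs for Durk's data (36,874 x 17,578) and roughly 645 million for the run above (11,527 x 55,941), so the two jobs are of comparable size. Bootstrapped standard errors repeat the whole calculation once per replication.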
Date: Fri, 10 Aug 2012 11:44:50 +0200
From: Durk Linzel <[email protected]>
Subject: Re: st: Prop.score matching: assess significance t-value + slow
kernel matching
Dear Ariel,
Thank you for your response.
In the meantime I have also tried -psmatch2-. It is indeed a little
more user friendly. Frustratingly enough, I have still not been able to
get results for kernel matching. With -psmatch2-, too, the computer
gets 'stuck'. What can I do to prevent this? It shouldn't be
impossible to run kernel matching with 54,452 observations, should it?
My syntax is:
. psmatch2 mutuelle male married no_edu primary secondary wealth_index
urban birthregister, kernel outcome(outpatient) kerneltype(normal)
common logit
Logistic regression                               Number of obs   =      54452
                                                  LR chi2(8)      =    2276.54
                                                  Prob > chi2     =     0.0000
Log likelihood = -33110.634                       Pseudo R2       =     0.0332

------------------------------------------------------------------------------
    mutuelle |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |  -.1632353   .0187863    -8.69   0.000    -.2000557   -.1264149
     married |   .4163476    .021967    18.95   0.000     .3732932    .4594021
      no_edu |  -.8109381   .1463669    -5.54   0.000    -1.097812   -.5240643
     primary |  -.6586061   .1454905    -4.53   0.000    -.9437621     -.37345
   secondary |  -.4192023   .1486974    -2.82   0.005    -.7106439   -.1277607
wealth_index |    .278155   .0076323    36.44   0.000     .2631959    .2931141
       urban |   -.903245   .0333668   -27.07   0.000    -.9686428   -.8378472
birthregis~r |   .4453641    .035968    12.38   0.000     .3748681    .5158602
       _cons |   .6483167    .148126     4.38   0.000     .3579951    .9386384
------------------------------------------------------------------------------
.
.
.
Here it gets stuck.
Thanking you in advance!
Durk Linzel & Maloe Bosch
On Thu, Aug 9, 2012 at 9:41 PM, Ariel Linden, DrPH
<[email protected]> wrote:
> Hi Durk,
>
> The simple answer here is that you should consider using -psmatch2-, a
> user-written program available from SSC. This program will allow you to choose
> nearest neighbor matching and kernel matching (among several options). The
> program uses regression to estimate the treatment effect and will provide
> you with the p value already.
>
> I find this program to be a lot more user friendly and intuitive than
> -pscore-.
>
> Ariel
>
>
> Date: Wed, 8 Aug 2012 13:14:46 +0200
> From: Durk Linzel <[email protected]>
> Subject: st: Prop.score matching: assess significance t-value + slow
> kernel matching
>
> Dear Stata users,
>
> I have been struggling with two problems related to propensity score
> matching for a long time. I could not find the answer in previous
> posts, nor in the literature. I use Stata 12.0 for Windows, 32-bit,
> revision 25 July 2011.
>
> I am doing propensity score matching, with 8 covariates, with a
> database of 54,452 observations. I have successfully executed nearest
> neighbor matching with the user-written Stata package -pscore-
> and its accompanying command -attnd-. The results are shown below.
>
> . attnd inpatient mutuelle male married no_edu primary secondary urban
> wealth_index birthregister, pscore(mypscore) logit comsup
> ATT estimation with Nearest Neighbor Matching method
> (random draw version)
> Analytical standard errors
> ---------------------------------------------------------
> n. treat.   n. contr.         ATT    Std. Err.          t
> ---------------------------------------------------------
>     36874       17569       0.029        0.002     17.768
> ---------------------------------------------------------
>
> 1) My first question is: how can I assess the significance level of
> this result? With the t-value, I would be able to simply look up the
> significance level for a certain t-value, but I would need to know the
> degrees of freedom for the propensity score. How many degrees of
> freedom does a propensity score have? Or are there other ways within
> Stata to assess the significance of my nearest neighbor matching
> results?
>
> 2) My second question relates to kernel matching. As a complement to
> nearest neighbor matching I would like to run kernel matching. The
> problem is that if I run kernel matching with the user-written command
> -attk- (also part of -pscore-), Stata gets stuck while 'thinking'.
> I have let it run for up to several hours, but it never produced a
> result. I have tried different combinations of the default bandwidth,
> bandwidth (0.6) or bandwidth (0.03), with the Epanechnikov kernel or
> the Gaussian kernel (default). With bandwidth (0.6) and the Epanechnikov
> kernel, I managed to get a result, but without a standard error and
> t-value (see result below). Stata suggests using the option for
> bootstrapped standard errors, but if I run this Stata gets stuck again.
> What is going wrong? I'm sure my large number of observations requires
> more running time, but is there any way I can get it to actually produce
> results and/or run quicker?
>
> Thanking you in advance!
>
> Durk Linzel
>
>
> . attk outpatient mutuelle male married no_edu primary secondary
> wealth_index urban birthregister, pscore(mypscore) logit comsup epan
> bwidth(0.6)
>
>  The program is searching for matches of each treated unit.
>  This operation may take a while.
> ATT estimation with the Kernel Matching method
> ---------------------------------------------------------
> n. treat.   n. contr.         ATT   Std. Err.           t
> ---------------------------------------------------------
>     36874       17578       0.068           .           .
> ---------------------------------------------------------
> Note: Analytical standard errors cannot be computed. Use
> the bootstrap option to get bootstrapped standard errors.
>
> . attk outpatient mutuelle male married no_edu primary secondary
> wealth_index urban birthregister, pscore(mypscore) logit comsup epan
> bwidth(0.6) boot
>
>  The program is searching for matches of each treated unit.
>  This operation may take a while.
>
>
>
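
On the first question in the original post (how to judge significance from a t-value without knowing the degrees of freedom): with tens of thousands of observations, the t distribution is practically indistinguishable from the standard normal, so the exact degrees of freedom hardly matter. The Python sketch below computes a two-sided p-value under that normal approximation. The function name is made up for illustration, and the calculation takes the reported standard error at face value.

```python
import math


def two_sided_p_normal(t_stat):
    """Two-sided p-value under a standard normal approximation.

    Uses P(|Z| >= |t|) = erfc(|t| / sqrt(2)).
    Hypothetical helper for illustration; not part of -pscore- or -psmatch2-.
    """
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


if __name__ == "__main__":
    # t-values reported in this thread.
    for label, t in [("attnd, nearest neighbor", 17.768),
                     ("psmatch2, ATT", 5.17)]:
        print(label, "t =", t, "p =", two_sided_p_normal(t))
```

A t-value as large as 17.768 corresponds to a p-value that is effectively zero, far below any conventional significance threshold.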