
Re: st: interpretation reciprocal causation ivprobit cdsimeq

From	Austin Nichols <[email protected]>
To	[email protected]
Subject	Re: st: interpretation reciprocal causation ivprobit cdsimeq
Date	Tue, 30 Oct 2012 10:07:05 -0400
I would guess part of any causal explanation for the lack of response
includes the long-winded presentation of the problem and the numerous
places in which individual sentences do not parse, either of which
might deter some readers. For example:
"
The Wald test of exogeneity for IVprobit estimations allows to reject
the null hypothesis of exogeneity of the instruments (chi2=7.45,
p-value= 0.0064).
 "
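is not proper English, does not seem to be part of a well-formed
question, and incorrectly characterizes what the Wald test of
exogeneity means: its null hypothesis is that the instrumented
regressor (here, grossmargin) is exogenous, not that the instruments
are.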

For my part, I would suspect that 144 obs is not enough to get good
results from any kind of IV procedure in most data. Furthermore,
-probit- type IV procedures rely on strong distributional assumptions
and will be inconsistent in the presence of any number of violations
of assumptions, e.g. heteroskedasticity, whereas linear IV with robust
SE (the linear probability model version) requires weaker assumptions.
 At a minimum, I would like to see the output you get from -ivreg2-
(SSC) applied to your model.

But before any of that, you should explain the exclusion restrictions:
an IV model is only as good as the story you have for why
"q3_individual farms" and "q4_totaluaa_sq" have an impact on
"grossmargin" but no direct impact on "insurance2011" (except through
their impact on grossmargin). If we do not know what these variables
are, it is hard to assess the scientific model that generates your
statistical model. You might also describe the data. A short
introduction along the following lines will make it much easier for
people to answer your question about any subsequent Stata output:
"I have data that .... My model is that ... affects ... but not ....
Stata's official -ivprobit- gives an estimate of .... The user-written
-cdsimeq- described at
http://www.stata-journal.com/sjpdf.html?articlenum=st0038
gives an estimate of ...."

In general, an IV strategy to estimate the effect of X on Y does not
"care" whether simultaneous equations or omitted variables are the
source of the endogeneity of X. If you have instruments Z that affect
Y only via X, then you can estimate the causal impact of X on Y in
either case.  It is your scientific model that will tell you the
source of endogeneity, not your statistical model.
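
A minimal simulation sketch of the point above, using made-up data
rather than the poster's. The variable names (z, u, x1, y1, x2, y2),
the true effect of 0.5, the feedback coefficient of 0.4, the seed, and
the sample size are all assumptions chosen for illustration. In both
cases OLS should be biased, while IV using z should recover something
close to 0.5, whether the endogeneity comes from an omitted variable
or from simultaneity:

```stata
clear
set seed 12345
set obs 5000

* z is a valid instrument: it shifts x but enters y only through x
generate z = rnormal()

* Case 1: omitted variable u drives both x1 and y1
generate u  = rnormal()
generate x1 = z + u + rnormal()
generate y1 = 0.5*x1 + u + rnormal()
regress y1 x1
ivregress 2sls y1 (x1 = z), vce(robust)

* Case 2: simultaneity
*   structural equations: y2 = 0.5*x2 + e1   and   x2 = 0.4*y2 + z + e2
*   solving for x2 gives the reduced form used below
generate e1 = rnormal()
generate e2 = rnormal()
generate x2 = (z + 0.4*e1 + e2) / (1 - 0.5*0.4)
generate y2 = 0.5*x2 + e1
regress y2 x2
ivregress 2sls y2 (x2 = z), vce(robust)
```

The estimation command is identical in the two cases; only the
data-generating story differs, which is why the source of endogeneity
has to come from the scientific model.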

Finally, I have never used -cdsimeq- before, but I note that it cannot
estimate the -ivprobit- model you want to compare to:

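webuse laborsup
* Newey's two-step estimator, other_inc instrumented by male_educ
ivprobit fem_work fem_educ kids (other_inc = male_educ), twostep
* two-stage probit least squares: (continuous eq.) (dichotomous eq.)
cdsimeq (other_inc male_educ fem_educ kids) (fem_work fem_educ kids)
* linear probability IV with first-stage output and robust SEs
ivreg2 fem_work fem_educ kids (other_inc = male_educ), first r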

(noting in passing that this example from -ivprobit- relies on the
absurd assumption that "kids" is exogenous).
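
For readers without -ivreg2- installed, a rough equivalent of the
linear IV fit can be sketched with official Stata commands. This is
only an illustration on the same example dataset, not a substitute for
the -ivreg2- output requested above, and it carries over the same
questionable exogeneity assumptions:

```stata
webuse laborsup, clear

* linear probability model by 2SLS with heteroskedasticity-robust SEs
ivregress 2sls fem_work fem_educ kids (other_inc = male_educ), vce(robust)

* first-stage summary statistics, to gauge instrument strength
estat firststage

* test whether other_inc can be treated as exogenous
estat endogenous
```

With a single excluded instrument the model is exactly identified, so
no test of overidentifying restrictions is available here.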


On Tue, Oct 30, 2012 at 5:56 AM, Nick Cox <[email protected]> wrote:
> You don't give full references, as requested in the Statalist FAQ.
>
> That detail of etiquette, however, doesn't explain why you got no
> answer. In this case, they look like standard references that anyone
> acquainted with your field would recognise (nevertheless, you are
> still asked to give full references).
>
> As I posted yesterday, there are about 5000 members of Statalist, and
> simply but importantly I can't speak for anyone else. What follows is
> a personal guess. Most of those 5000 people don't post anything, and
> that's great, because otherwise the list would collapse. It's like my
> relationship with my newspaper: I read what looks interesting or
> useful to me, ignore most of it, and feel no obligation to write to
> it.
>
> First off, this is an intensely econometric question. That cuts down
> the number of people interested and competent to say anything at all,
> and cuts me out, for example.
>
> Questions broadly like yours are quite common on Statalist. They are
> certainly allowed. But in practice they are often unanswered.
>
> My impression is that you do a very good job of explaining what you
> are trying, but the root of it is that you want advice on correctness
> of conclusions and interpretation of results. In essence, that's a
> pretty tough call for anyone; even people working on similar or
> identical problems would have difficulty giving an answer that is
> concise, precise and helpful.
>
> It's difficult to know whether a question will be answered. Sometimes
> a poster hits the jackpot: someone on the list knows the same problem
> and says something useful. Sometimes not.
>
> A study of the archives -- look at thread indexes such as
> <http://www.stata.com/statalist/archive/2012-10/index.html> -- will
> show many good questions that went unanswered.
>
> In short, I don't think there is an obvious way of making your
> question better. It's just a difficult question to answer and no-one
> so far has felt moved to respond.
>
> Beyond the FAQ there's generic advice at
>
> <http://www.stata.com/statalist/archive/2012-10/msg00174.html>
>
> <http://blog.stata.com/2010/12/14/how-to-successfully-ask-a-question-on-statalist/>
>
> Nick
>
> On Tue, Oct 30, 2012 at 8:26 AM,  <[email protected]> wrote:
>
>> As I am new here, I would like to understand how to improve my question sent ten days ago in order to get your feedback. It is about the interpretation of the results of two estimations procedures ivprobit. Do not hesitate to let me know if this is not the good place to ask such questions or good question format.
>
>> I have run the following regressions using ivprobit and cdsimeq and I am not too sure about the interpretation. please see my question in capital letters below. Thanks a lot for your help.
>>
>>
>>  In order to account for the potential endogeneity between insurance decision (binary variable) and economic performance (continuous), we adopt a 2SLS estimation technique where total gross margin is instrumented. We use Newey's (1987) minimum-chi-squared estimator (ivprobit twostep option). We find that economic performance, as defined by the total gross margin, significantly explains insurance adoption (table 1).  Post-estimation tests: We ran the joint significance test of the instruments in the first stage regression (F-statistic>10). The Amemiya-Lee-Newey test of overidentifying restrictions is not significant (chi2=2.025, p-value= 0.1547). The Wald test of exogeneity for IVprobit estimations allows to reject the null hypothesis of exogeneity of the instruments (chi2=7.45, p-value= 0.0064).
>>
>> Then, we verify whether there is reciprocal causation between insurance use and economic performance (total gross margin). To obtain this result, we rely on the two-stage probit least squares estimation method described in (Maddala 1983) for simultaneous equations models in which one of the endogenous variables is continuous (total gross margin) and the other endogenous variable is dichotomous (insurance use) (cdsimeq command in Stata http://www.stata-journal.com/article.html?article=st0038). We find that economic performance (total gross margin) significantly explains insurance adoption but the reverse effect is not significant (table 2).
>>
>> IS IT CORRECT TO CONCLUDE AS FOLLOWS?
>> The result suggests that the endogeneity bias between insurance decision and economic performance is due to omitted variables, and not reciprocal causation. It therefore justifies the use of the ivprobit model where economic performance is instrumented to explain insurance decision, rather than the (Maddala 1983)  estimation procedure (cdsimeq).
>>
>> Table 1: 2SLS Probability to adopt insurance, with instrumentation of gross margin
>>
>> First step
>> Number of obs =     144
>> R-squared     =  0.2453
>> Adj R-squared =  0.1946
>>
>> grossmargin            |        Coef.      Std. Err.       t    P>|t|     [95% Conf. Interval]
>> -----------------------+----------------------------------------------------------------------
>> q3_individual farms    |    -228188.2***    112400.5   -2.03    0.044    -450496.7   -5879.677
>> q4_totaluaa_sq         |     .1313854***    .0329142    3.99    0.000     .0662869    .1964839
>> nuts2_32               |     51374.23       95259.15    0.54    0.591    -137031.8    239780.2
>> nuts2_33               |     16731.56       100579.9    0.17    0.868      -182198    215661.2
>> nuts2_34               |     10620.46       100359.5    0.11    0.916    -187873.1      209114
>> nuts2_41               |     7942.025       127155.7    0.06    0.950    -243549.8    259433.8
>> nuts2_42               |     93651.27        99561.3    0.94    0.349    -103263.6    290566.2
>> q4_ratiorent           |    -33656.87       82139.63   -0.41    0.683    -196114.8      128801
>> q21_noninsuranmeasures |     -56509.8       68454.26   -0.83    0.411    -191900.4     78880.8
>> _cons                  |     292010.1       154708.1    1.89    0.061    -13975.68    597995.8
>>
>> Second step
>> Number of obs   =       144
>> Wald chi2(8)    =     29.93
>> Prob > chi2     =    0.0002
>>
>> insurance2011          |        Coef.      Std. Err.       z    P>|z|     [95% Conf. Interval]
>> -----------------------+----------------------------------------------------------------------
>> I_grossmargin          |     3.88e-06***    1.43e-06    2.72    0.006     1.09e-06    6.68e-06
>> nuts2_32               |    -1.805997       .5538496   -3.26    0.001    -2.891522   -.7204715
>> nuts2_33               |    -1.224679       .5420211   -2.26    0.024    -2.287021   -.1623367
>> nuts2_34               |    -.9044687       .5287984   -1.71    0.087    -1.940894     .131957
>> nuts2_41               |    -2.162879       .7412688   -2.92    0.004    -3.615739   -.7100187
>> nuts2_42               |     -3.11869       .7796749   -4.00    0.000    -4.646824   -1.590555
>> q4_ratiorent           |     1.127435       .4558475    2.47    0.013     .2339906     2.02088
>> q21_noninsuranmeasures |    -.7945278       .3990442   -1.99    0.046     -1.57664   -.0124156
>> _cons                  |     .4889867       .4838786    1.01    0.312     -.459398    1.437371
>>
>> Wald test of exogeneity:     chi2(1) =     7.45           Prob > chi2 = 0.0064
>> Test of overidentifying restrictions: Amemiya-Lee-Newey minimum chi-sq statistic     Chi-sq(1)= 2.025     P-value = 0.1547
>>
>>
>>
>> Table 2: two-stage probit least squares estimation (cdsimeq) –
>> SECOND STAGE REGRESSIONS WITH CORRECTED STANDARD ERRORS
>>
>>
>> ------------------------------------------------------------------------------
>>  grossmargin |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>> I_insur~2011 |  -15809.74   28736.81    -0.55   0.583    -72623.96    41004.48
>> q3_individ~s |  -239284.4   113697.5    -2.10   0.037    -464070.5   -14498.42
>> q4_totalua~q |   .1351443   .0322533     4.19   0.000     .0713778    .1989108
>>        _cons |   264916.4   104230.9     2.54   0.012     58846.34    470986.5
>> ------------------------------------------------------------------------------
>> ------------------------------------------------------------------------------
>> insuran~2011 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>> I_grossmar~n |   2.67e-06***   1.07e-06     2.50   0.012     5.77e-07    4.77e-06
>>     nuts2_32 |  -1.634341   .5218917    -3.13   0.002     -2.65723   -.6114523
>>     nuts2_33 |  -1.201044   .5233067    -2.30   0.022    -2.226706   -.1753817
>>     nuts2_34 |  -.8801489   .5140605    -1.71   0.087    -1.887689    .1273911
>>     nuts2_41 |  -2.143736   .7280761    -2.94   0.003    -3.570739   -.7167331
>>     nuts2_42 |  -2.832533   .6944463    -4.08   0.000    -4.193622   -1.471443
>> q4_ratiorent |    1.11448    .445112     2.50   0.012     .2420765    1.986884
>> q21_nonins~s |   -.755226    .381161    -1.98   0.048    -1.502288   -.0081642
>>        _cons |    .483358   .4681395     1.03   0.302    -.4341785    1.400894
>> ------------------------------------------------------------------------------
>
