

From: Austin Nichols <austinnichols@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: interpretation reciprocal causation ivprobit cdsimeq
Date: Tue, 30 Oct 2012 10:07:05 -0400

I would guess part of any causal explanation for the lack of response includes the long-winded presentation of the problem and the numerous places in which individual sentences do not parse, either of which might deter some readers. For example:

" The Wald test of exogeneity for IVprobit estimations allows to reject the null hypothesis of exogeneity of the instruments (chi2=7.45, p-value= 0.0064). "

is not proper English, does not seem to be part of a well-formed question, and incorrectly characterizes what the Wald test of exogeneity means.

For my part, I would suspect that 144 obs is not enough to get good results from any kind of IV procedure in most data. Furthermore, -probit- type IV procedures rely on strong distributional assumptions and will be inconsistent in the presence of any number of violations of assumptions, e.g. heteroskedasticity, whereas linear IV with robust SE (the linear probability model version) requires weaker assumptions. At a minimum, I would like to see the output you get from -ivreg2- (SSC) applied to your model.

But before any of that, you should explain the exclusion restrictions: an IV model is only as good as the story you have for why "q3_individual farms" and "q4_totaluaa_sq" have an impact on "grossmargin" but no direct impact on "insurance2011" (except through their impact on grossmargin). If we do not know what these variables are, it is hard to assess the scientific model that generates your statistical model. You might also describe the data. Those 2 sentences of introduction will make it much easier for people to answer your question about any subsequent Stata output:

I have data that .... My model is that ... affects ... but not ....
Stata's official -ivprobit- gives an estimate of ....
The user-written -cdsimeq- described at http://www.stata-journal.com/sjpdf.html?articlenum=st0038 gives an estimate of ....

In general, an IV strategy to estimate the effect of X on Y does not "care" whether simultaneous equations or omitted variables are the source of the endogeneity of X. If you have instruments Z that affect Y only via X, then you can estimate the causal impact of X on Y in either case. It is your scientific model that will tell you the source of endogeneity, not your statistical model.

Finally, I have never used -cdsimeq- before, but I note that it cannot estimate the -ivprobit- model you want to compare to:

webuse laborsup
ivprobit fem_work fem_educ kids (other_inc = male_educ), twostep
cdsimeq (other_inc male_educ fem_educ kids) (fem_work fem_educ kids)
ivreg2 fem_work fem_educ kids (other_inc = male_educ), first r

(noting in passing that this example from -ivprobit- relies on the absurd assumption that "kids" is exogenous).

On Tue, Oct 30, 2012 at 5:56 AM, Nick Cox <njcoxstata@gmail.com> wrote:
> You don't give full references, as requested in the Statalist FAQ.
>
> That detail of etiquette, however, doesn't explain why you got no
> answer. In this case, they look like standard references that anyone
> acquainted with your field would recognise (nevertheless, you are
> still asked to give full references).
>
> As I posted yesterday, there are about 5000 members of Statalist, and
> simply but importantly I can't speak for anyone else. What follows is
> a personal guess. Most of those 5000 people don't post anything, and
> that's great, because otherwise the list would collapse. It's like my
> relationship with my newspaper: I read what looks interesting or
> useful to me, ignore most of it, and feel no obligation to write to
> it.
>
> First off, this is an intensely econometric question. That cuts down
> the number of people interested and competent to say anything at all,
> and cuts me out, for example.
>
> Questions broadly like yours are quite common on Statalist. They are
> certainly allowed. But in practice they are often unanswered.
>
> My impression is that you do a very good job of explaining what you
> are trying, but the root of it is that you want advice on correctness
> of conclusions and interpretation of results. In essence, that's a
> pretty tough call for anyone; even people working on similar or
> identical problems would have difficulty giving an answer that is
> concise, precise and helpful.
>
> It's difficult to know whether a question will be answered. Sometimes
> a poster hits the jackpot: someone on the list knows the same problem
> and say something useful. Sometimes not.
>
> A study of the archives -- look at thread indexes such as
> <http://www.stata.com/statalist/archive/2012-10/index.html> -- will
> show many good questions that went unanswered.
>
> In short, I don't think there is an obvious way of making your
> question better. It's just a difficult question to answer and no-one
> so far has felt moved to respond.
>
> Beyond the FAQ there's generic advice at
>
> <http://www.stata.com/statalist/archive/2012-10/msg00174.html>
>
> <http://blog.stata.com/2010/12/14/how-to-successfully-ask-a-question-on-statalist/>
>
> Nick
>
> On Tue, Oct 30, 2012 at 8:26 AM, <Marianne.LEFEBVRE@ec.europa.eu> wrote:
>
>> As I am new here, I would like to understand how to improve my question sent ten days ago in order to get your feedback. It is about the interpretation of the results of two estimations procedures ivprobit. Do not hesitate to let me know if this is not the good place to ask such questions or good question format.
>
> Marianne.LEFEBVRE@ec.europa.eu
>
>> I have run the following regressions using ivprobit and cdsimeq and I am not too sure about the interpretation. please see my question in capital letters below. Thanks a lot for your help.
>>
>>
>> In order to account for the potential endogeneity between insurance decision (binary variable) and economic performance (continuous), we adopt a 2SLS estimation technique where total gross margin is instrumented.
>> We use Newey's (1987) minimum-chi-squared estimator (ivprobit twostep option). We find that economic performance, as defined by the total gross margin, significantly explains insurance adoption (table 1). Post-estimation tests: We ran the joint significance test of the instruments in the first stage regression (F-statistic>10). The Amemiya-Lee-Newey test of overidentifying restrictions is not significant (chi2=2.025, p-value= 0.1547). The Wald test of exogeneity for IVprobit estimations allows to reject the null hypothesis of exogeneity of the instruments (chi2=7.45, p-value= 0.0064).
>>
>> Then, we verify whether there is reciprocal causation between insurance use and economic performance (total gross margin). To obtain this result, we rely on the two-stage probit least squares estimation method described in (Maddala 1983) for simultaneous equations models in which one of the endogenous variables is continuous (total gross margin) and the other endogenous variable is dichotomous (insurance use) (cdsimeq command in Stata http://www.stata-journal.com/article.html?article=st0038). We find that economic performance (total gross margin) significantly explains insurance adoption but the reverse effect is not significant (table 2).
>>
>> IS IT CORRECT TO CONCLUDE AS FOLLOWS?
>> The result suggests that the endogeneity bias between insurance decision and economic performance is due to omitted variables, and not reciprocal causation. It therefore justifies the use of the ivprobit model where economic performance is instrumented to explain insurance decision, rather than the (Maddala 1983) estimation procedure (cdsimeq).
>>
>> Table 1: 2SLS Probability to adopt insurance, with instrumentation of gross margin
>>
>> First step
>> Number of obs = 144
>> R-squared = 0.2453
>> Adj R-squared = 0.1946
>>
>> grossmargin             Coef.         Std. Err.  t      P>t    [95% Conf. Interval]
>>
>> q3_individual farms     -228188.2***  112400.5   -2.03  0.044  -450496.7  -5879.677
>> q4_totaluaa_sq          .1313854***   .0329142   3.99   0.000  .0662869   .1964839
>> nuts2_32                51374.23      95259.15   0.54   0.591  -137031.8  239780.2
>> nuts2_33                16731.56      100579.9   0.17   0.868  -182198    215661.2
>> nuts2_34                10620.46      100359.5   0.11   0.916  -187873.1  209114
>> nuts2_41                7942.025      127155.7   0.06   0.950  -243549.8  259433.8
>> nuts2_42                93651.27      99561.3    0.94   0.349  -103263.6  290566.2
>> q4_ratiorent            -33656.87     82139.63   -0.41  0.683  -196114.8  128801
>> q21_noninsuranmeasures  -56509.8      68454.26   -0.83  0.411  -191900.4  78880.8
>> _cons                   292010.1      154708.1   1.89   0.061  -13975.68  597995.8
>>
>> Second step
>> Number of obs = 144
>> Wald chi2(8) = 29.93
>> Prob > chi2 = 0.0002
>>
>> insurance2011           Coef.         Std. Err.  z      P>z    [95% Conf. Interval]
>>
>> I_grossmargin           3.88e-06***   1.43e-06   2.72   0.006  1.09e-06   6.68e-06
>> nuts2_32                -1.805997     .5538496   -3.26  0.001  -2.891522  -.7204715
>> nuts2_33                -1.224679     .5420211   -2.26  0.024  -2.287021  -.1623367
>> nuts2_34                -.9044687     .5287984   -1.71  0.087  -1.940894  .131957
>> nuts2_41                -2.162879     .7412688   -2.92  0.004  -3.615739  -.7100187
>> nuts2_42                -3.11869      .7796749   -4.00  0.000  -4.646824  -1.590555
>> q4_ratiorent            1.127435      .4558475   2.47   0.013  .2339906   2.02088
>> q21_noninsuranmeasures  -.7945278     .3990442   -1.99  0.046  -1.57664   -.0124156
>> _cons                   .4889867      .4838786   1.01   0.312  -.459398   1.437371
>>
>> Wald test of exogeneity: chi2(1) = 7.45 Prob > chi2 = 0.0064
>> Test of overidentifying restrictions: Amemiya-Lee-Newey minimum chi-sq statistic Chi-sq(1)= 2.025 P-value = 0.1547
>>
>>
>>
>> Table 2: two-stage probit least squares estimation (cdsimeq) -
>> SECOND STAGE REGRESSIONS WITH CORRECTED STANDARD ERRORS
>>
>>
>> ------------------------------------------------------------------------------
>>  grossmargin | Coef.         Std. Err.  t      P>|t|  [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>> I_insur~2011 | -15809.74     28736.81   -0.55  0.583  -72623.96  41004.48
>> q3_individ~s | -239284.4     113697.5   -2.10  0.037  -464070.5  -14498.42
>> q4_totalua~q | .1351443      .0322533   4.19   0.000  .0713778   .1989108
>>        _cons | 264916.4      104230.9   2.54   0.012  58846.34   470986.5
>> ------------------------------------------------------------------------------
>> ------------------------------------------------------------------------------
>> insuran~2011 | Coef.         Std. Err.  z      P>|z|  [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>> I_grossmar~n | 2.67e-06***   1.07e-06   2.50   0.012  5.77e-07   4.77e-06
>>     nuts2_32 | -1.634341     .5218917   -3.13  0.002  -2.65723   -.6114523
>>     nuts2_33 | -1.201044     .5233067   -2.30  0.022  -2.226706  -.1753817
>>     nuts2_34 | -.8801489     .5140605   -1.71  0.087  -1.887689  .1273911
>>     nuts2_41 | -2.143736     .7280761   -2.94  0.003  -3.570739  -.7167331
>>     nuts2_42 | -2.832533     .6944463   -4.08  0.000  -4.193622  -1.471443
>> q4_ratiorent | 1.11448       .445112    2.50   0.012  .2420765   1.986884
>> q21_nonins~s | -.755226      .381161    -1.98  0.048  -1.502288  -.0081642
>>        _cons | .483358       .4681395   1.03   0.302  -.4341785  1.400894
>> ------------------------------------------------------------------------------
>
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
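
Editorial illustration (not part of the original message): the reply above argues that an IV strategy does not "care" whether the endogeneity of X comes from omitted variables or from simultaneous equations, provided the instrument Z affects Y only via X. The sketch below is one way to check that claim on simulated data. Everything in it is an assumption made for illustration: the data-generating processes, the coefficients `beta` and `gamma`, the sample size, and the function names are hypothetical and have nothing to do with the farm data in the thread. It uses the simple single-instrument ratio estimator cov(Z,Y)/cov(Z,X) in plain Python rather than -ivreg2- or -ivprobit-.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate of the effect of x on y."""
    return cov(z, y) / cov(z, x)


def simulate_omitted_variable(n, beta, rng):
    """Hypothetical DGP: an unobserved u drives both x and y."""
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)              # instrument, independent of u
        ui = rng.gauss(0, 1)              # omitted confounder
        xi = zi + ui + rng.gauss(0, 1)
        yi = beta * xi + ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def simulate_simultaneity(n, beta, gamma, rng):
    """Hypothetical DGP: y = beta*x + e and x = gamma*y + z + v jointly."""
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ei = rng.gauss(0, 1)
        vi = rng.gauss(0, 1)
        # Reduced form for x, obtained by solving the two equations.
        xi = (gamma * ei + zi + vi) / (1 - beta * gamma)
        yi = beta * xi + ei
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def run_demo(seed=12345, n=20000, beta=0.5, gamma=0.5):
    """Return OLS and IV slope estimates under both sources of endogeneity."""
    rng = random.Random(seed)
    results = {}
    for name, data in (
        ("omitted_variable", simulate_omitted_variable(n, beta, rng)),
        ("simultaneity", simulate_simultaneity(n, beta, gamma, rng)),
    ):
        z, x, y = data
        results[name] = {"ols": ols_slope(x, y), "iv": iv_slope(z, x, y)}
    return results


if __name__ == "__main__":
    for name, est in run_demo().items():
        print(f"{name}: OLS = {est['ols']:.3f}, IV = {est['iv']:.3f} (true beta = 0.5)")
```

In both simulated cases the OLS slope should drift away from the true coefficient while the IV slope stays close to it, which is the sense in which the statistical model alone cannot tell the two sources of endogeneity apart. The sketch says nothing about whether the instruments in the thread satisfy the exclusion restriction, nor about small samples such as 144 observations.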

**References**:
- **st: interpretation reciprocal causation ivprobit cdsimeq** *From:* <Marianne.LEFEBVRE@ec.europa.eu>
- **Re: st: interpretation reciprocal causation ivprobit cdsimeq** *From:* Nick Cox <njcoxstata@gmail.com>
