# Re: st: qvf command for count data

From: "Austin Nichols"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: qvf command for count data
Date: Fri, 10 Mar 2006 15:06:05 -0500

```
On 3/10/06, Hugh Colaco <Hugh.Colaco@business.uconn.edu> wrote:
> qvf  y x1 x2 x3 x4 (z1 x3 x4), family(nbinomial) robust cluster (A);
> ivreg  y x1 x2 x3 x4 (z1 = x3 x4), robust cluster (A);

You seem to be misspecifying both the -ivreg- and -qvf- calls at a very
basic level: which variables are the included instruments and which
are the excluded ones?  Do you mean z1 to be an excluded instrument for
two endogenous variables x3 and x4?  If so, your equation is not
identified, since one excluded instrument cannot identify two
endogenous regressors.  Note that your -ivreg- syntax regresses y on
x1, x2, x3, x4, and z1, with z1 instrumented by x3 and x4.  Because x3
and x4 are already regressors, no excluded instruments remain, and the
command will not run as written:

. net from http://www.stata-journal.com/software/sj3-4
. net inst st0049
. clear
. set obs 1000
. gen x1 = uniform()
. gen x2 = uniform()
. gen x3 = uniform()
. gen err = invnorm(uniform())
. gen y = 1+2*x1+3*x2+4*x3+err
. gen x4 = uniform()
. gen t3 = .8*x3 + .6*invnorm(uniform())
. gen z1=t3
. ivreg  y x1 x2 x3 x4 (z1 = x3 x4)
equation not identified; must have at least as many instruments not in
the regression as there are instrumented variables
r(481);

.  qvf y x1 x2 x3 x4 (x1 x2 x4 t3)
IV Generalized linear models                       No. of obs      =      1000
Optimization     : MQL Fisher scoring              Residual df     =       995
                   (IRLS EIM)                      Scale param     =  2.137276
Deviance         =  2126.589444                    (1/df) Deviance =  2.137276
Pearson          =   2126.58962                    (1/df) Pearson  =  2.137276
Variance Function: V(u) = 1                        [Gaussian]
Link Function    : g(u) = u                        [Identity]
Standard Errors  : OIM Sandwich
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   1.914558    .106234    18.02   0.000     1.706343    2.122773
          x2 |   2.912829   .1086845    26.80   0.000     2.699811    3.125846
          x4 |   .2775132   .1095646     2.53   0.011     .0627706    .4922558
          x3 |   4.106679   .3455157    11.89   0.000     3.429481    4.783877
       _cons |     .93817    .193988     4.84   0.000     .5579605     1.31838
------------------------------------------------------------------------------

Try using -ivreg2- instead.  It has good first-stage diagnostics, and
the fact that your endogenous variable is a count does not make the
standard IV estimator inconsistent; you only lose a little efficiency
by disregarding that fact.  Note that many of the classic RHS
endogenous variables are counts, e.g. educational attainment, and most
researchers would use -ivreg2- on these models.

. ssc install ivreg2
. ivreg2  y x1 x2 x4 (x3=z1), ffirst

Summary results for first-stage regressions
-------------------------------------------
              Shea
Variable    | Partial R2    |    Partial R2    F(  1,   995)    P-value
x3          |   0.1009      |      0.1009         111.65         0.0000

Underidentification tests:
                                               Chi-sq(1)      P-value
Anderson canon. corr. likelihood ratio stat.      106.35         0.0000
Cragg-Donald N*minEval stat.                      112.21         0.0000
Ho: matrix of reduced form coefficients has rank=K-1 (underidentified)
Ha: matrix has rank>=K (identified)

Weak identification statistics:
Cragg-Donald (N-L)*minEval/L2 F-stat     111.65

Anderson-Rubin test of joint significance of
endogenous regressors B1 in main equation, Ho:B1=0
F(1,995)=      67.79     P-val=0.0000
Chi-sq(1)=     68.13     P-val=0.0000

Number of observations N           =       1000
Number of regressors   K           =          5
Number of instruments  L           =          5
Number of excluded instruments L2  =          1

Instrumental variables (2SLS) regression
----------------------------------------
Number of obs =     1000
F(  4,   995) =   292.31
Prob > F      =   0.0000
Total (centered) SS     =  3276.986562                Centered R2   =   0.7013
Total (uncentered) SS   =  33323.59494                Uncentered R2 =   0.9706
Residual SS             =  978.9706817                Root MSE      =    .9894
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x3 |   4.106679    .337572    12.17   0.000      3.44505    4.768308
          x1 |   1.914558   .1075383    17.80   0.000     1.703787    2.125329
          x2 |   2.912829   .1075683    27.08   0.000     2.701999    3.123658
          x4 |   .2775132   .1073605     2.58   0.010     .0670905    .4879358
       _cons |     .93817   .1888342     4.97   0.000     .5680617    1.308278
------------------------------------------------------------------------------
Anderson canon. corr. LR statistic (identification/IV relevance test): 106.350
Chi-sq(1) P-val =    0.0000
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.000
(equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         x3
Included instruments: x1 x2 x4
Excluded instruments: z1
------------------------------------------------------------------------------

```
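
The identification point in the reply can be sketched in a short do-file. In -ivreg-, endogenous regressors go to the left of the "=" inside the parentheses and excluded instruments to the right; the equation is identified only if there are at least as many excluded instruments as endogenous regressors. The seed, the coefficients, and the way x3 is made endogenous below are assumptions chosen for illustration and are not taken from the thread; only the variable names follow it.

```stata
* Minimal sketch on simulated data. The seed, coefficients, and the way x3
* is made endogenous are assumptions for illustration, not from the thread.
clear
set seed 20060310
set obs 1000
gen x1 = uniform()
gen x2 = uniform()
gen x4 = uniform()
gen z1 = uniform()
gen u  = invnorm(uniform())
* x3 depends on the instrument z1 and on the structural error u
gen x3 = .8*z1 + .5*u + .3*invnorm(uniform())
gen y  = 1 + 2*x1 + 3*x2 + 4*x3 + u

* Identified: one endogenous regressor (x3), one excluded instrument (z1).
* x1, x2, and x4 act as included instruments by appearing outside the parentheses.
ivreg y x1 x2 x4 (x3 = z1)

* Not identified: x3 and x4 are already regressors, so the parentheses
* supply no excluded instrument for z1.
* -capture noisily- shows the error but lets the do-file continue.
capture noisily ivreg y x1 x2 x3 x4 (z1 = x3 x4)
display "return code: " _rc
```

The second call should reproduce the "equation not identified" error quoted in the reply. In Stata 10 and later, -ivregress 2sls- accepts the same parenthesized syntax, for example `ivregress 2sls y x1 x2 x4 (x3 = z1)`.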