Hi everybody. Some of you may recall the discussion relating to
ivreset, numerical accuracy in detecting collinearity, and the
contributions of Arne Risa Hole (below) and Jeff Pitblado in
http://www.stata.com/statalist/archive/2006-04/msg00455.html
http://www.stata.com/statalist/archive/2006-04/msg00462.html
Thanks to a suggestion of Kit Baum, ivreset has been revised so the
correct test statistic is reported with higher order polynomials and
collinearity is less of a problem. In case you're curious, the partial
solution is to orthogonalize the polynomial terms before running the
artificial regression that generates the test statistic. It's a partial
solution because collinearity between the polynomial terms can still be
a problem, but it arises less often than it did before orthogonalizing.
The new version of ivreset is available from SSC, with thanks as usual
to Kit Baum.
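
For anyone who wants to try the orthogonalization idea by hand, here is a minimal sketch for the OLS case. It is not the actual ivreset code: it assumes that official -orthog- is an acceptable way to orthogonalize the powers of yhat, and the variable names oyhat2-oyhat4 are made up for illustration.

```stata
* Sketch only: -orthog- is an assumption here, not necessarily
* what ivreset.ado does internally.
use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
capture drop yhat* oyhat*
qui regress lw s
qui predict double yhat
* Raw (unrescaled) powers of the fitted values
qui gen double yhat2 = yhat^2
qui gen double yhat3 = yhat^3
qui gen double yhat4 = yhat^4
* Orthogonalize the powers; oyhat2-oyhat4 are hypothetical names
orthog yhat2 yhat3 yhat4, generate(oyhat2 oyhat3 oyhat4)
* Artificial regression and joint test on the orthogonalized terms
qui regress lw s oyhat2 oyhat3 oyhat4
testparm oyhat*
```

If -testparm- still reports a dropped constraint, that is the remaining collinearity mentioned above.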
Cheers,
Mark
Prof. Mark E. Schaffer
Director
Centre for Economic Reform and Transformation
Department of Economics
School of Management & Languages
Heriot-Watt University
Edinburgh EH14 4AS UK
44-131-451-3494 direct
44-131-451-3296 fax
http://www.sml.hw.ac.uk/cert
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Arne Risa Hole
> Sent: Monday, March 20, 2006 6:08 PM
> To: [email protected]
> Subject: Re: st: ivreset
>
> Mark,
>
> Note, however, that it is only because of the collinearity
> problem - the fact that one variable is dropped from the
> model when the forecast is not rescaled - that the test
> statistics differ in your OLS example.
> The rescaling does not matter in a test with only the squared
> and cubed forecasts for example.
>
> This is different in the case of IV: even when collinearity
> is not a problem the test statistics differ, as the example
> of running Austin's code with and without the rescaling shows.
>
> The solution is probably to rescale after creating the
> polynomials as you suggest - this way of rescaling the
> forecasts does not affect the test statistic in either case,
> IV or OLS, and may solve the numerical problems that using
> the actual forecast might introduce - on the other hand it
> doesn't seem to sort out the collinearity so this problem remains.
>
> Cheers,
> Arne
>
>
>
>
> On 20/03/06, Schaffer, Mark E <[email protected]> wrote:
> > Arne,
> >
> > I think you are onto something here. The rescaling should perhaps
> > take place *after* creating the polynomials, not before.
> >
> > That said, Stata's official -ovtest- also rescales first, then creates
> > the polynomials. When I wrote -ivreset-, I had used replication of
> > the output of -ovtest- as a check, and I used the same approach to
> > rescaling yhat, namely rescale and then create polynomials (rather
> > than create polynomials and then rescale).
> >
> > Here is an example that demonstrates that (a) rescaling after creating
> > the polynomials leaves the reset statistic unchanged, and (b)
> > rescaling before creating the polynomials replicates the output of
> > official -ovtest-. Note that there are collinearity problems with (a)
> > even after rescaling, which makes me hesitate...
> >
> > Does anyone else want to comment?
> >
> > --Mark
> >
> > *********** do code ***************
> >
> > use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
> > capture drop yhat*
> > qui regress lw s
> > qui predict double yhat
> > * yhatr=rescaled yhat
> > sum yhat, meanonly
> > qui gen double yhatr = (yhat-r(min))/(r(max)-r(min))
> > qui gen double yhat2=yhat^2
> > qui gen double yhat3=yhat^3
> > qui gen double yhat4=yhat^4
> > qui gen double yhatr2=yhatr^2
> > qui gen double yhatr3=yhatr^3
> > qui gen double yhatr4=yhatr^4
> > * yhatrr2=rescaled yhat2
> > * yhatrr3=rescaled yhat3
> > * yhatrr4=rescaled yhat4
> > sum yhat2, meanonly
> > qui gen double yhatrr2 = (yhat2-r(min))/(r(max)-r(min))
> > sum yhat3, meanonly
> > qui gen double yhatrr3 = (yhat3-r(min))/(r(max)-r(min))
> > sum yhat4, meanonly
> > qui gen double yhatrr4 = (yhat4-r(min))/(r(max)-r(min))
> > sum yhat*
> >
> > * Unrescaled RESET
> > qui regress lw s yhat2 yhat3 yhat4
> > testparm yhat*
> > * yhat that is first ^2, ^3, ^4, then rescaled
> > * Output identical to unrescaled RESET
> > qui regress lw s yhatrr2 yhatrr3 yhatrr4
> > testparm yhat*
> > * yhat that is first rescaled, then ^2, ^3, ^4
> > * Output different from unrescaled RESET
> > qui regress lw s yhatr2 yhatr3 yhatr4
> > testparm yhat*
> > * Stata's built-in ovtest
> > * Output again different from unrescaled RESET
> > qui regress lw s
> > ovtest
> >
> >
> > ************* output ****************
> >
> >
> > . use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
> > (Wages of Very Young Men, Zvi Griliches, J.Pol.Ec. 1976)
> >
> > . capture drop yhat*
> >
> > . qui regress lw s
> >
> > . qui predict double yhat
> >
> > . * yhatr=rescaled yhat
> > . sum yhat, meanonly
> >
> > . qui gen double yhatr = (yhat-r(min))/(r(max)-r(min))
> >
> > . qui gen double yhat2=yhat^2
> >
> > . qui gen double yhat3=yhat^3
> >
> > . qui gen double yhat4=yhat^4
> >
> > . qui gen double yhatr2=yhatr^2
> >
> > . qui gen double yhatr3=yhatr^3
> >
> > . qui gen double yhatr4=yhatr^4
> >
> > . * yhatrr2=rescaled yhat2
> > . * yhatrr3=rescaled yhat3
> > . * yhatrr4=rescaled yhat4
> > . sum yhat2, meanonly
> >
> > . qui gen double yhatrr2 = (yhat2-r(min))/(r(max)-r(min))
> >
> > . sum yhat3, meanonly
> >
> > . qui gen double yhatrr3 = (yhat3-r(min))/(r(max)-r(min))
> >
> > . sum yhat4, meanonly
> >
> > . qui gen double yhatrr4 = (yhat4-r(min))/(r(max)-r(min))
> >
> > . sum yhat*
> >
> >     Variable |       Obs        Mean    Std. Dev.        Min        Max
> > -------------+---------------------------------------------------------
> >         yhat |       758    5.686739     .2156493   5.261107   6.130727
> >        yhatr |       758    .4894459     .2479809          0          1
> >        yhat2 |       758    32.38544     2.473995   27.67924   37.58581
> >        yhat3 |       758    184.7003      21.3139   145.6234   230.4284
> >        yhat4 |       758    1054.929     163.4247   766.1405   1412.693
> > -------------+---------------------------------------------------------
> >       yhatr2 |       758    .3009707     .2769761          0          1
> >       yhatr3 |       758    .2142687     .2681644          0          1
> >       yhatr4 |       758    .1671979     .2530434          0          1
> >      yhatrr2 |       758    .4750582     .2497327          0          1
> >      yhatrr3 |       758    .4607848     .2513286          0          1
> > -------------+---------------------------------------------------------
> >      yhatrr4 |       758    .4466593      .252763          0          1
> >
> > .
> > . * Unrescaled RESET
> > . qui regress lw s yhat2 yhat3 yhat4
> >
> > . testparm yhat*
> >
> > ( 1) yhat2 = 0
> > ( 2) yhat3 = 0
> > ( 3) yhat4 = 0
> > Constraint 1 dropped
> >
> > F( 2, 754) = 0.87
> > Prob > F = 0.4191
> >
> > . * yhat that is first ^2, ^3, ^4, then rescaled
> > . * Output identical to unrescaled RESET
> > . qui regress lw s yhatrr2 yhatrr3 yhatrr4
> >
> > . testparm yhat*
> >
> > ( 1) yhatrr2 = 0
> > ( 2) yhatrr3 = 0
> > ( 3) yhatrr4 = 0
> > Constraint 1 dropped
> >
> > F( 2, 754) = 0.87
> > Prob > F = 0.4191
> >
> > . * yhat that is first rescaled, then ^2, ^3, ^4
> > . * Output different from unrescaled RESET
> > . qui regress lw s yhatr2 yhatr3 yhatr4
> >
> > . testparm yhat*
> >
> > ( 1) yhatr2 = 0
> > ( 2) yhatr3 = 0
> > ( 3) yhatr4 = 0
> >
> > F( 3, 753) = 0.59
> > Prob > F = 0.6216
> >
> > . * Stata's built-in ovtest
> > . * Output again different from unrescaled RESET
> > . qui regress lw s
> >
> > . ovtest
> >
> > Ramsey RESET test using powers of the fitted values of lw
> > Ho: model has no omitted variables
> > F(3, 753) = 0.59
> > Prob > F = 0.6216
> >
> > .
> > end of do-file
> >
> > *************************************
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of Arne Risa Hole
> > > Sent: 20 March 2006 16:03
> > > To: [email protected]
> > > Subject: Re: st: ivreset
> > >
> > > Hi Mark,
> > >
> > > Thank you very much for the clarifications. The rescaling does have
> > > an impact on the test statistic, however; this can be seen from
> > > using Austin's code and comparing the results with and without the
> > > line:
> > >
> > > replace yhat = (yhat-r(min))/(r(max)-r(min))
> > >
> > > So even when the correct "optimal" forecast of yhat is used
> > > (yhat=X-hat*beta-hat), rescaling the forecast affects the result.
> > > This is not a problem in the case of -ovtest-, however, since the
> > > Reset test statistic is invariant to the rescaling in the OLS case.
> > >
> > > Sorry for going on about this, but it seems to me that since the two
> > > statistics differ the correct statistic is the one without the
> > > rescaling of the (yhat=X-hat*beta-hat) forecast (even though this
> > > may introduce numerical precision problems in some cases).
> > >
> > > Cheers,
> > > Arne
> > >
> > > On 20/03/06, Schaffer, Mark E <[email protected]> wrote:
> > > > Arne,
> > > >
> > > > > -----Original Message-----
> > > > > From: [email protected]
> > > > > > [mailto:[email protected]] On Behalf Of Arne Risa Hole
> > > > > Sent: 18 March 2006 11:43
> > > > > To: [email protected]
> > > > > Subject: RE: st: ivreset
> > > > >
> > > > > Austin, Mark,
> > > > >
> > > > > > Thank you both for your replies, rescaling the forecast did the
> > > > > > trick (sorry for the bad formatting of my code before, it looked
> > > > > > fine in my email programme).
> > > > > >
> > > > > > I understand the motivation behind the rescaling, but I'm slightly
> > > > > > concerned about the fact that it produces a different test statistic
> > > > > > compared to using the actual forecast. Note that this is not the
> > > > > > case when using the Reset test following OLS - the test statistic is
> > > > > > invariant to the rescaling in this case.
> > > > > >
> > > > > > I would think that since the two approaches (rescaling/no
> > > > > > rescaling) produce different results, the correct test statistic is
> > > > > > that using the actual forecast?
> > > >
> > > > > There are two different issues here. Rescaling is one, and it is, in
> > > > > some sense, a side issue. The problem is that sometimes the yhat has
> > > > > large-ish values, and the higher order polynomials of yhat that
> > > > > are included in the artificial regression can get so big that the
> > > > > regression fails for numerical precision reasons. Stata's own version
> > > > > of the reset test, -ovtest-, also does this rescaling. The test
> > > > > statistic is, of course, invariant in theory to the units used and
> > > > > hence to rescaling.
> > > >
> > > > > The other issue is the one you might have missed. As Austin pointed
> > > > > out, the IV version of the RESET test cannot use standard fitted
> > > > > values that would be generated by -predict- after estimation using
> > > > > -ivreg- or -ivreg2-. These are yhat=X*beta-hat, and the problem
> > > > > is that X includes some endogenous regressors.
> > > >
> > > > > As the help file for -ivreset- explains, there are two alternatives.
> > > > > One is to use reduced form predictions for yhat, i.e., regress y
> > > > > on all the exogenous variables (including the excluded instruments)
> > > > > and then use -predict-. The other is to get what Pesaran and Taylor
> > > > > call the "optimal forecast" of yhat. This is not yhat=X*beta-hat, but
> > > > > yhat=X-hat*beta-hat, where X-hat includes the reduced form predicted
> > > > > values of the endogenous regressors (rather than the actual values).
> > > > > The code that Austin kindly posted to Statalist implemented the latter.
> > > >
> > > > Cheers,
> > > > Mark
> > > >
> > > > > Cheers
> > > > > Arne
> > > > >
> > > > > On Mar 17 2006, Schaffer, Mark E wrote:
> > > > >
> > > > > > Austin, Arne,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Austin Nichols [mailto:[email protected]]
> > > > > > > Sent: 16 March 2006 23:30
> > > > > > > To: [email protected]
> > > > > > > Subject: Re: st: ivreset
> > > > > > >
> > > > > > > > -findit ivreset- then -help ivreset- when installed has an
> > > > > > > > excellent exposition that begins:
> > > > > > > >
> > > > > > > > As Pagan and Hall (1983) and Pesaran and Taylor (1999) point out,
> > > > > > > > a RESET test for an IV regression cannot use the standard IV
> > > > > > > > predicted values X*beta-hat, because X includes endogenous
> > > > > > > > regressors that are correlated with u.
> > > > > > >
> > > > > > > Try this code instead:
> > > > > > >
> > > > > > > > use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta
> > > > > > > > qui ivreg2 lw s expr tenure rns smsa (iq=med kww)
> > > > > > > > predict ytilde
> > > > > > > > mat b=e(b)
> > > > > > > > mat li b
> > > > > > > > qui regress iq s expr tenure rns smsa med kww
> > > > > > > > qui predict double xh
> > > > > > > > gen yhat=ytil-b[1,1]*iq+b[1,1]*xh
> > > > > > > > sum yhat, meanonly
> > > > > > > > qui replace yhat = (yhat-r(min))/(r(max)-r(min))
> > > > > > > > qui gen double yhat2=yhat^2
> > > > > > > > qui ivreg2 lw s expr tenure rns smsa yhat2 (iq=med kww)
> > > > > > > > test yhat2
> > > > > > > > qui ivreg2 lw s expr tenure rns smsa (iq=med kww)
> > > > > > > > ivreset
> > > > > > > >
> > > > > > > > Now, as to why
> > > > > > > > replace yhat = (yhat-r(min))/(r(max)-r(min))
> > > > > > > > I cannot tell you, but it's in ivreset.ado
> > > > > >
> > > > > > > It's to rescale yhat so that when it's squared, cubed, etc.,
> > > > > > > it doesn't get wildly out of scale relative to the other
> > > > > > > regressors. This can cause problems for the regression that
> > > > > > > includes these terms.
> > > > > >
> > > > > > Cheers,
> > > > > > Mark
> > > > > >
> > > > > >
> > > > > > > On 16 Mar 2006 19:10:13 +0000, Arne Risa Hole wrote:
> > > > > > > > > I am using ivreset to do a Pesaran-Taylor Reset test after
> > > > > > > > > ivreg2. However, I am not able to replicate the result from
> > > > > > > > > ivreset manually. For example:
> > > > > > > > >
> > > > > > > > > use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta
> > > > > > >
> > > > > > >