Mark,
Note, however, that it is only because of the collinearity problem -
the fact that one variable is dropped from the model when the forecast
is not rescaled - that the test statistics differ in your OLS example.
The rescaling does not matter in a test with only the squared and
cubed forecasts, for example.
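
A minimal sketch of that check, assuming the same Griliches data and the
same lw-on-s regression as in your example (the scalar names F_raw and
F_resc are just my own labels):

```stata
use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
qui regress lw s
qui predict double yhat
* Rescale yhat to [0,1] first, as -ovtest- does
sum yhat, meanonly
qui gen double yhatr = (yhat-r(min))/(r(max)-r(min))
* Squared and cubed forecasts only, unrescaled and rescaled
qui gen double yhat2  = yhat^2
qui gen double yhat3  = yhat^3
qui gen double yhatr2 = yhatr^2
qui gen double yhatr3 = yhatr^3
* RESET with unrescaled powers
qui regress lw s yhat2 yhat3
qui testparm yhat2 yhat3
scalar F_raw = r(F)
* RESET with powers of the rescaled forecast
qui regress lw s yhatr2 yhatr3
qui testparm yhatr2 yhatr3
scalar F_resc = r(F)
di "unrescaled F = " F_raw "   rescaled F = " F_resc
```

If no variable is dropped for collinearity, the two F statistics should
agree up to rounding.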
This is different in the case of IV: even when collinearity is not a
problem the test statistics differ, as the example of running Austin's
code with and without the rescaling shows.
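
A rough way to see why, assuming I have the algebra right (a and c are
just my shorthand for the minimum and the range of the forecast, and X
is taken to include the constant):

```latex
\tilde{y} = \frac{\hat{y} - a}{c}, \qquad
a = \min \hat{y}, \quad c = \max \hat{y} - \min \hat{y}
\quad\Longrightarrow\quad
\tilde{y}^{2} = \frac{1}{c^{2}}\left(\hat{y}^{2} - 2a\,\hat{y} + a^{2}\right).
% OLS: \hat{y} = X\hat{\beta} lies in the column space of X, so
%      span(X, \tilde{y}^2) = span(X, \hat{y}^2) and the test is unchanged.
% IV:  \hat{y} = \hat{X}\hat{\beta} is generally not in the column space
%      of X, so the extra -2a\hat{y} term changes the augmented regression.
```

So squaring the rescaled forecast brings in a term that is linear in
yhat. After OLS that term is absorbed by the regressors already in the
model; after IV, with the "optimal forecast", it generally is not.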
The solution is probably to rescale after creating the polynomials, as
you suggest. Rescaling the forecasts this way does not affect the test
statistic in either case, IV or OLS, and may solve the numerical
problems that using the actual forecast might introduce. On the other
hand, it doesn't seem to sort out the collinearity, so that problem
remains.
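
For the IV case, here is a sketch along the lines of Austin's code that
compares the three variants. It assumes -ivreg2- is installed, and the
names b_iq, xh, yhatr2 and yhatrr2 are only my own labels. I have used
_b[iq] rather than b[1,1] so that the code does not depend on the column
order of e(b).

```stata
* Sketch only: assumes -ivreg2- is installed (ssc install ivreg2)
use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
qui ivreg2 lw s expr tenure rns smsa (iq=med kww)
scalar b_iq = _b[iq]
qui predict double ytilde
* Reduced-form prediction of the endogenous regressor
qui regress iq s expr tenure rns smsa med kww
qui predict double xh
* "Optimal forecast": swap iq for its reduced-form prediction
qui gen double yhat = ytilde - b_iq*iq + b_iq*xh
* Unrescaled squared forecast
qui gen double yhat2 = yhat^2
* Rescale first, then square
sum yhat, meanonly
qui gen double yhatr2 = ((yhat-r(min))/(r(max)-r(min)))^2
* Square first, then rescale
sum yhat2, meanonly
qui gen double yhatrr2 = (yhat2-r(min))/(r(max)-r(min))
* Same augmented IV regression with each version of the squared forecast
foreach v in yhat2 yhatr2 yhatrr2 {
    qui ivreg2 lw s expr tenure rns smsa `v' (iq=med kww)
    qui test `v'
    scalar chi_`v' = r(chi2)
    di "`v'" _col(12) chi_`v'
}
```

The statistics for yhat2 and yhatrr2 should coincide, since yhatrr2 is
just a linear transformation of yhat2 and the model has a constant; the
one for yhatr2 is the one that differs.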
Cheers,
Arne
On 20/03/06, Schaffer, Mark E <M.E.Schaffer@hw.ac.uk> wrote:
> Arne,
>
> I think you are onto something here. The rescaling should perhaps take
> place *after* creating the polynomials, not before.
>
> That said, Stata's official -ovtest- also rescales first, then creates
> the polynomials. When I wrote -ivreset-, I had used replication of the
> output of -ovtest- as a check, and I used the same approach to rescaling
> yhat, namely rescale and then create polynomials (rather than create
> polynomials and then rescale).
>
> Here is an example that demonstrates that (a) rescaling after creating
> the polynomials leaves the reset statistic unchanged, and (b) rescaling
> before creating the polynomials replicates the output of official
> -ovtest-. Note that there are collinearity problems with (a) even after
> rescaling, which makes me hesitate...
>
> Does anyone else want to comment?
>
> --Mark
>
> *********** do code ***************
>
> use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
> capture drop yhat*
> qui regress lw s
> qui predict double yhat
> * yhatr=rescaled yhat
> sum yhat, meanonly
> qui gen double yhatr = (yhat-r(min))/(r(max)-r(min))
> qui gen double yhat2=yhat^2
> qui gen double yhat3=yhat^3
> qui gen double yhat4=yhat^4
> qui gen double yhatr2=yhatr^2
> qui gen double yhatr3=yhatr^3
> qui gen double yhatr4=yhatr^4
> * yhatrr2=rescaled yhat2
> * yhatrr3=rescaled yhat3
> * yhatrr4=rescaled yhat4
> sum yhat2, meanonly
> qui gen double yhatrr2 = (yhat2-r(min))/(r(max)-r(min))
> sum yhat3, meanonly
> qui gen double yhatrr3 = (yhat3-r(min))/(r(max)-r(min))
> sum yhat4, meanonly
> qui gen double yhatrr4 = (yhat4-r(min))/(r(max)-r(min))
> sum yhat*
>
> * Unrescaled RESET
> qui regress lw s yhat2 yhat3 yhat4
> testparm yhat*
> * yhat that is first ^2, ^3, ^4, then rescaled
> * Output identical to unrescaled RESET
> qui regress lw s yhatrr2 yhatrr3 yhatrr4
> testparm yhat*
> * yhat that is first rescaled, then ^2, ^3, ^4
> * Output different from unrescaled RESET
> qui regress lw s yhatr2 yhatr3 yhatr4
> testparm yhat*
> * Stata's built-in ovtest
> * Output again different from unrescaled RESET
> qui regress lw s
> ovtest
>
> ************* output ****************
>
>
> . use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta, clear
> (Wages of Very Young Men, Zvi Griliches, J.Pol.Ec. 1976)
>
> . capture drop yhat*
>
> . qui regress lw s
>
> . qui predict double yhat
>
> . * yhatr=rescaled yhat
> . sum yhat, meanonly
>
> . qui gen double yhatr = (yhat-r(min))/(r(max)-r(min))
>
> . qui gen double yhat2=yhat^2
>
> . qui gen double yhat3=yhat^3
>
> . qui gen double yhat4=yhat^4
>
> . qui gen double yhatr2=yhatr^2
>
> . qui gen double yhatr3=yhatr^3
>
> . qui gen double yhatr4=yhatr^4
>
> . * yhatrr2=rescaled yhat2
> . * yhatrr3=rescaled yhat3
> . * yhatrr4=rescaled yhat4
> . sum yhat2, meanonly
>
> . qui gen double yhatrr2 = (yhat2-r(min))/(r(max)-r(min))
>
> . sum yhat3, meanonly
>
> . qui gen double yhatrr3 = (yhat3-r(min))/(r(max)-r(min))
>
> . sum yhat4, meanonly
>
> . qui gen double yhatrr4 = (yhat4-r(min))/(r(max)-r(min))
>
> . sum yhat*
>
>     Variable |       Obs        Mean    Std. Dev.        Min        Max
> -------------+---------------------------------------------------------
>         yhat |       758    5.686739    .2156493    5.261107   6.130727
>        yhatr |       758    .4894459    .2479809           0          1
>        yhat2 |       758    32.38544    2.473995    27.67924   37.58581
>        yhat3 |       758    184.7003     21.3139    145.6234   230.4284
>        yhat4 |       758    1054.929    163.4247    766.1405   1412.693
> -------------+---------------------------------------------------------
>       yhatr2 |       758    .3009707    .2769761           0          1
>       yhatr3 |       758    .2142687    .2681644           0          1
>       yhatr4 |       758    .1671979    .2530434           0          1
>      yhatrr2 |       758    .4750582    .2497327           0          1
>      yhatrr3 |       758    .4607848    .2513286           0          1
> -------------+---------------------------------------------------------
>      yhatrr4 |       758    .4466593     .252763           0          1
>
> .
> . * Unrescaled RESET
> . qui regress lw s yhat2 yhat3 yhat4
>
> . testparm yhat*
>
> ( 1) yhat2 = 0
> ( 2) yhat3 = 0
> ( 3) yhat4 = 0
> Constraint 1 dropped
>
> F( 2, 754) = 0.87
> Prob > F = 0.4191
>
> . * yhat that is first ^2, ^3, ^4, then rescaled
> . * Output identical to unrescaled RESET
> . qui regress lw s yhatrr2 yhatrr3 yhatrr4
>
> . testparm yhat*
>
> ( 1) yhatrr2 = 0
> ( 2) yhatrr3 = 0
> ( 3) yhatrr4 = 0
> Constraint 1 dropped
>
> F( 2, 754) = 0.87
> Prob > F = 0.4191
>
> . * yhat that is first rescaled, then ^2, ^3, ^4
> . * Output different from unrescaled RESET
> . qui regress lw s yhatr2 yhatr3 yhatr4
>
> . testparm yhat*
>
> ( 1) yhatr2 = 0
> ( 2) yhatr3 = 0
> ( 3) yhatr4 = 0
>
> F( 3, 753) = 0.59
> Prob > F = 0.6216
>
> . * Stata's built-in ovtest
> . * Output again different from unrescaled RESET
> . qui regress lw s
>
> . ovtest
>
> Ramsey RESET test using powers of the fitted values of lw
> Ho: model has no omitted variables
> F(3, 753) = 0.59
> Prob > F = 0.6216
>
> .
> end of do-file
>
> *************************************
>
> > -----Original Message-----
> > From: owner-statalist@hsphsun2.harvard.edu
> > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of
> > Arne Risa Hole
> > Sent: 20 March 2006 16:03
> > To: statalist@hsphsun2.harvard.edu
> > Subject: Re: st: ivreset
> >
> > Hi Mark,
> >
> > Thank you very much for the clarifications. The rescaling
> > does have an impact on the test statistic, however; this can
> > be seen from using Austin's code and comparing the results
> > with and without the line:
> >
> > replace yhat = (yhat-r(min))/(r(max)-r(min))
> >
> > So even when the correct "optimal" forecast of yhat is used
> > (yhat=X-hat*beta-hat), rescaling the forecast affects the
> > result. This is not a problem in the case of -ovtest-,
> > however, since the Reset test statistic is invariant to the
> > rescaling in the OLS case.
> >
> > Sorry for going on about this, but it seems to me that since
> > the two statistics differ the correct statistic is the one
> > without the rescaling of the (yhat=X-hat*beta-hat) forecast
> > (even though this may introduce numerical precision problems
> > in some cases).
> >
> > Cheers,
> > Arne
> >
> > On 20/03/06, Schaffer, Mark E <M.E.Schaffer@hw.ac.uk> wrote:
> > > Arne,
> > >
> > > > -----Original Message-----
> > > > From: owner-statalist@hsphsun2.harvard.edu
> > > > > [mailto:owner-statalist@hsphsun2.harvard.edu] On Behalf Of Arne Risa Hole
> > > > Sent: 18 March 2006 11:43
> > > > To: statalist@hsphsun2.harvard.edu
> > > > Subject: RE: st: ivreset
> > > >
> > > > Austin, Mark,
> > > >
> > > > Thank you both for your replies, rescaling the forecast did the
> > > > trick (sorry for the bad formatting of my code before, it looked
> > > > fine in my email programme).
> > > >
> > > > > I understand the motivation behind the rescaling, but I'm slightly
> > > > > concerned about the fact that it produces a different test statistic
> > > > > compared to using the actual forecast. Note that this is not the
> > > > > case when using the Reset test following OLS - the test statistic is
> > > > > invariant to the rescaling in this case.
> > > > >
> > > > > I would think that since the two approaches (rescaling/ no
> > > > > rescaling) produce different results, the correct test statistic is
> > > > > that using the actual forecast?
> > >
> > > > There are two different issues here. Rescaling is one, and it is, in
> > > > some sense, a side issue. The problem is that sometimes the yhat has
> > > > large-ish values, and the higher order polynomials of yhat that are
> > > > included in the artificial regression can get so big that the
> > > > regression fails for numerical precision reasons. Stata's own version
> > > > of the reset test, -ovtest-, also does this rescaling. The test
> > > > statistic is, of course, invariant in theory to the units used and
> > > > hence to rescaling.
> > >
> > > > The other issue is the one you might have missed. As Austin pointed
> > > > out, the IV version of the RESET test cannot use standard fitted
> > > > values that would be generated by -predict- after estimation using
> > > > -ivreg- or -ivreg2-. These are yhat=X*beta-hat, and the problem is
> > > > that X includes some endogenous regressors.
> > >
> > > > As the help file for -ivreset- explains, there are two alternatives.
> > > > One is to use reduced form predictions for yhat, i.e., regress y on
> > > > all the exogenous variables (including the excluded instruments) and
> > > > then use -predict-. The other is to get what Pesaran and Taylor call
> > > > the "optimal forecast" of yhat. This is not yhat=X*beta-hat, but
> > > > yhat=X-hat*beta-hat, where X-hat includes the reduced form predicted
> > > > values of the endogenous regressors (rather than the actual values).
> > > > The code that Austin kindly posted to Statalist implemented the
> > > > latter.
> > >
> > > Cheers,
> > > Mark
> > >
> > > > Cheers
> > > > Arne
> > > >
> > > > On Mar 17 2006, Schaffer, Mark E wrote:
> > > >
> > > > > Austin, Arne,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Austin Nichols [mailto:austinnichols@gmail.com]
> > > > > > Sent: 16 March 2006 23:30
> > > > > > To: statalist@hsphsun2.harvard.edu
> > > > > > Subject: Re: st: ivreset
> > > > > >
> > > > > > > -findit ivreset- then -help ivreset- when installed has an
> > > > > > > excellent exposition that begins:
> > > > > > >
> > > > > > > As Pagan and Hall (1983) and Pesaran and Taylor (1999) point
> > > > > > > out, a RESET test for an IV regression cannot use the standard
> > > > > > > IV predicted values X*beta-hat, because X includes endogenous
> > > > > > > regressors that are correlated with u.
> > > > > >
> > > > > > Try this code instead:
> > > > > >
> > > > > > > use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta
> > > > > > > qui ivreg2 lw s expr tenure rns smsa (iq=med kww)
> > > > > > > predict ytilde
> > > > > > > mat b=e(b)
> > > > > > > mat li b
> > > > > > > qui regress iq s expr tenure rns smsa med kww
> > > > > > > qui predict double xh
> > > > > > > gen yhat=ytil-b[1,1]*iq+b[1,1]*xh
> > > > > > > sum yhat, meanonly
> > > > > > > qui replace yhat = (yhat-r(min))/(r(max)-r(min))
> > > > > > > qui gen double yhat2=yhat^2
> > > > > > > qui ivreg2 lw s expr tenure rns smsa yhat2 (iq=med kww)
> > > > > > > test yhat2
> > > > > > > qui ivreg2 lw s expr tenure rns smsa (iq=med kww)
> > > > > > > ivreset
> > > > > >
> > > > > > > Now, as to why
> > > > > > >     replace yhat = (yhat-r(min))/(r(max)-r(min))
> > > > > > > I cannot tell you, but it's in ivreset.ado
> > > > >
> > > > > > It's to rescale yhat so that when it's squared, cubed, etc., it
> > > > > > doesn't get wildly out of scale relative to the other regressors.
> > > > > > This can cause problems for the regression that includes these
> > > > > > terms.
> > > > >
> > > > > Cheers,
> > > > > Mark
> > > > >
> > > > >
> > > > > > > On 16 Mar 2006 19:10:13 +0000, Arne Risa Hole wrote:
> > > > > > > > I am using ivreset to do a Pesaran-Taylor Reset test after
> > > > > > > > ivreg2. However, I am not able to replicate the result from
> > > > > > > > ivreset manually. For example:
> > > > > > >
> > > > > > > use http://fmwww.bc.edu/ec-p/data/hayashi/griliches76.dta
> > > > > >
> > > > > >
> > > > >
> > > > > *
> > > > > * For searches and help try:
> > > > > * http://www.stata.com/support/faqs/res/findit.html
> > > > > * http://www.stata.com/support/statalist/faq
> > > > > * http://www.ats.ucla.edu/stat/stata/
> > > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> >
>
>