Stata 15 help for hausman

[R] hausman -- Hausman specification test

Syntax

hausman name-consistent [name-efficient] [, options]

    options                Description
    -------------------------------------------------------------------------
    Main
      constant             include estimated intercepts in comparison;
                             default is to exclude
      alleqs               use all equations to perform test; default is
                             first equation only
      skipeqs(eqlist)      skip specified equations when performing test
      equations(matchlist) associate/compare the specified (by number) pairs
                             of equations
      force                force performance of test, even though
                             assumptions are not met
      df(#)                use # degrees of freedom
      sigmamore            base both (co)variance matrices on disturbance
                             variance estimate from efficient estimator
      sigmaless            base both (co)variance matrices on disturbance
                             variance estimate from consistent estimator

    Advanced
      tconsistent(string)  consistent estimator column header
      tefficient(string)   efficient estimator column header
    -------------------------------------------------------------------------

where name-consistent and name-efficient are names under which estimation results were stored via estimates store. A period (.) may be used to refer to the last estimation results, even if these were not already stored. Not specifying name-efficient is equivalent to specifying the last estimation results as ".".

Menu

Statistics > Postestimation

Description

hausman performs Hausman's specification test.

Options

        +------+
    ----+ Main +-------------------------------------------------------------

constant specifies that the estimated intercept(s) be included in the model comparison; by default, they are excluded. The default behavior is appropriate for models in which the constant does not have a common interpretation across the two models.

alleqs specifies that all the equations in the models be used to perform the Hausman test; by default, only the first equation is used.

skipeqs(eqlist) specifies in eqlist the names of equations to be excluded from the test. Equation numbers are not allowed in this context, because the equation names, along with the variable names, are used to identify common coefficients.

equations(matchlist) specifies, by number, the pairs of equations that are to be compared.

The matchlist in equations() should follow the syntax

#c:#e [,#c:#e[, ...]]

where #c (#e) is an equation number of the always-consistent (efficient under H0) estimator. For instance, equations(1:1), equations(1:1, 2:2), or equations(1:2).

If equations() is not specified, then equations are matched on equation names.

equations() handles the situation in which one estimator uses equation names and the other does not. For instance, equations(1:2) means that equation 1 of the always-consistent estimator is to be tested against equation 2 of the efficient estimator. equations(1:1, 2:2) means that equation 1 is to be tested against equation 1 and that equation 2 is to be tested against equation 2. If equations() is specified, the alleqs and skipeqs options are ignored.

force specifies that the Hausman test be performed, even though the assumptions of the Hausman test seem not to be met, for example, because the estimators were pweighted or the data were clustered.

df(#) specifies the degrees of freedom for the Hausman test. The default is the matrix rank of the variance of the difference between the coefficients of the two estimators.

sigmamore and sigmaless specify that the two covariance matrices used in the test be based on a common estimate of disturbance variance (sigma2).

sigmamore specifies that the covariance matrices be based on the estimated disturbance variance from the efficient estimator. This option provides a proper estimate of the contrast variance for so-called tests of exogeneity and overidentification in instrumental-variables regression.

sigmaless specifies that the covariance matrices be based on the estimated disturbance variance from the consistent estimator.

These options can be specified only when both estimators store e(sigma) or e(rmse), or with the xtreg command. e(sigma_e) is stored after the xtreg command with the fe or mle option. e(rmse) is stored after the xtreg command with the re option.

Specifying sigmamore or sigmaless is recommended when comparing fixed-effects and random-effects linear regression because the resulting differenced covariance matrix is much less likely to be non-positive definite (although the tests are asymptotically equivalent whether or not one of the options is specified).

        +----------+
    ----+ Advanced +---------------------------------------------------------

tconsistent(string) and tefficient(string) are formatting options. They allow you to specify the headers of the columns of coefficients that default to the names of the models. These options will be of interest primarily to programmers.

Remarks

The assumption that one of the estimators is efficient (that is, has minimal asymptotic variance) is a demanding one. It is violated, for instance, if your observations are clustered or pweighted, or if your model is otherwise misspecified. Moreover, even if the assumption is satisfied, there may be a "small sample" problem with the Hausman test. Hausman's test estimates the variance var(b-B) of the difference of the estimators (b from the always-consistent estimator, B from the efficient one) by the difference var(b)-var(B) of the variances. Under assumptions (1) and (3) of the steps listed below, var(b)-var(B) is a consistent estimator of var(b-B), but it is not necessarily positive definite in finite samples, that is, in your application. When it is not, the Hausman test is undefined. Unfortunately, this is not a rare event. Stata supports a generalized Hausman test that overcomes both of these problems (the efficiency requirement and the possibly non-positive-definite variance estimate). See [R] suest for details.
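For reference, the statistic discussed above can be written out in the notation of this entry. This is a sketch of the standard textbook form, assuming b and B are the coefficient vectors of the always-consistent and efficient estimators and V_b and V_B are their estimated covariance matrices:

```latex
H = (b - B)' \, (V_b - V_B)^{-1} \, (b - B)
```

Under the null hypothesis, H is compared with a chi-squared distribution whose degrees of freedom default to the rank of V_b - V_B (see the df() option). The formula makes the finite-sample problem visible: if V_b - V_B is not positive definite, the quadratic form need not be positive.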

To use hausman, perform the following steps.

    (1) obtain an estimator that is consistent whether or not the hypothesis is true;
    (2) store the estimation results under name-consistent by using estimates store;
    (3) obtain an estimator that is efficient (and consistent) under the hypothesis that you are testing, but inconsistent otherwise;
    (4) store the estimation results under name-efficient by using estimates store;
    (5) use hausman to perform the test

hausman name-consistent name-efficient [, options]

The order of computing the two estimators may be reversed. You have to be careful, though, to specify to hausman the models in the order "always consistent" first and "efficient under H0" second. It is possible to skip storing the second model and refer to the last estimation results by a period (.).
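As an illustration of the reversed order, the following sketch fits the efficient (random-effects) model first and stores both sets of results, so that neither has to be referenced with a period. It reuses the dataset and model from the first example in this entry; the stored names random and fixed are arbitrary labels chosen for this sketch.

```stata
* Sketch: fit the efficient estimator first, the consistent one second
webuse nlswork4

* Efficient under H0 (random effects), stored under an arbitrary name
xtreg ln_wage age msp ttl_exp, re
estimates store random

* Always consistent (fixed effects), stored under an arbitrary name
xtreg ln_wage age msp ttl_exp, fe
estimates store fixed

* The argument order is what matters: consistent first, efficient second
hausman fixed random, sigmamore
```

Because both models are stored by name, the order in which they were fit has no effect on the test; only the order of the two names passed to hausman does.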

hausman may be used in any context. The order in which you specify the regressors in each model does not matter, but you must ensure that the estimators and models are comparable and that they satisfy the theoretical conditions (see (1) and (3) above).

Examples

    ---------------------------------------------------------------------------
    Setup
        . webuse nlswork4
        . xtreg ln_wage age msp ttl_exp, fe
        . estimates store fixed
        . xtreg ln_wage age msp ttl_exp, re

    Test the appropriateness of the random-effects estimator (xtreg, re)
        . hausman fixed ., sigmamore

    ---------------------------------------------------------------------------
    Setup
        . webuse sysdsn3
        . mlogit insure age male
        . estimates store all
        . mlogit insure age male if insure != "Uninsure":insure
        . estimates store partial

    Perform Hausman test for independence of irrelevant alternatives
        . hausman partial all, alleqs constant

    ---------------------------------------------------------------------------
    Setup
        . sysuse auto
        . regress mpg price
        . estimates store reg
        . heckman mpg price, select(foreign=weight)

    Specify equations() option to force comparison when one estimator uses
    equation names and the other does not
        . hausman reg ., equation(1:1)

    Setup
        . probit foreign weight
        . estimates store probit_for
        . heckman mpg price, select(foreign=weight)

    Compare probit model and selection equation of heckman model
        . hausman probit_for ., equation(1:2)

    ---------------------------------------------------------------------------

Stored results

hausman stores the following in r():

    Scalars
      r(chi2)    chi-squared
      r(df)      degrees of freedom for the statistic
      r(p)       p-value for the chi-squared
      r(rank)    rank of (V_b-V_B)^(-1)
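The stored scalars can be read immediately after the test. The following is a minimal sketch that reuses the first example in this entry; the display formats are arbitrary choices for this illustration.

```stata
* Sketch: run the test quietly, then inspect what hausman left in r()
webuse nlswork4
quietly xtreg ln_wage age msp ttl_exp, fe
estimates store fixed
quietly xtreg ln_wage age msp ttl_exp, re
quietly hausman fixed ., sigmamore

* List everything stored in r()
return list

* Or pick out individual scalars
display "chi2(" r(df) ") = " %9.2f r(chi2) "    Prob > chi2 = " %6.4f r(p)
```

The r() results are overwritten by the next r-class command, so copy any values you need into scalars or locals before running further commands.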

