# Re: st: heteroskedasticity test in panel data

From: "Michael N. Mitchell"
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: heteroskedasticity test in panel data
Date: Tue, 27 Jul 2010 18:52:41 -0700

Dear Jing (and all)

I am afraid I am out of my element here... does anyone have an idea of what might be going on here?

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-27 6.11 PM, Jing Zhou wrote:
Dear Michael,

Following your suggestions, I rescaled the variables in my panel model, but the problems persist.

I then tried another approach. I re-estimated the first model without "igls"; no variable is omitted and the SEs are acceptable. But when I run "lrtest hetero ., df(`df')", I get the error message "hetero does not contain scalar e(ll)". I have attached the output below. Could you advise me why this happens? Thank you.

Jing

. xtgls roa tlawn genvironment aci2 size leverage age, panels (heteroskedastic)

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic
Correlation:   no autocorrelation

Estimated covariances      =       621          Number of obs      =      2916
Estimated autocorrelations =         0          Number of groups   =       621
Estimated coefficients     =         7          Obs per group: min =         1
                                                               avg =  4.695652
                                                               max =        10
                                                Wald chi2(6)       =  15124.70
                                                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         roa |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       tlawn |   .0713694   .0024424    29.22   0.000     .0665823    .0761564
genvironment |   .0012895   .0004232     3.05   0.002     .0004601    .0021188
        aci2 |   .0240381   .0015218    15.80   0.000     .0210555    .0270207
        size |   .0182393   .0006986    26.11   0.000       .01687    .0196085
    leverage |  -.0391871   .0030122   -13.01   0.000    -.0450909   -.0332833
         age |  -.0039249   .0001145   -34.27   0.000    -.0041493   -.0037004
       _cons |  -.3728419   .0144485   -25.80   0.000    -.4011604   -.3445234
------------------------------------------------------------------------------

. estimates store hetero

.
. xtgls roa tlawn genvironment aci2 size leverage age

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        homoskedastic
Correlation:   no autocorrelation

Estimated covariances      =         1          Number of obs      =      2916
Estimated autocorrelations =         0          Number of groups   =       621
Estimated coefficients     =         7          Obs per group: min =         1
                                                               avg =  4.695652
                                                               max =        10
                                                Wald chi2(6)       =    372.93
Log likelihood             =  855.1189          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         roa |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       tlawn |   .1003806   .0162142     6.19   0.000     .0686014    .1321598
genvironment |   .0015588   .0021767     0.72   0.474    -.0027075    .0058251
        aci2 |   .0244187    .006892     3.54   0.000     .0109106    .0379268
        size |   .0202311   .0034783     5.82   0.000     .0134137    .0270485
    leverage |  -.0176257   .0013651   -12.91   0.000    -.0203012   -.0149502
         age |  -.0039722   .0007833    -5.07   0.000    -.0055075    -.002437
       _cons |  -.4569187   .0729608    -6.26   0.000    -.5999192   -.3139181
------------------------------------------------------------------------------

.
. local df=e(N_g)-1

.
. display e(N_g)-1
620

. lrtest hetero ., df(620)
hetero does not contain scalar e(ll)
r(498);
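
A possible reading of this error, offered as a sketch rather than a diagnosis of Jing's data: the heteroskedastic output above has no "Log likelihood" line, while the homoskedastic output does. Without igls, -xtgls- with panels(heteroskedastic) is a two-step FGLS estimator rather than maximum likelihood, so there is no e(ll) for -lrtest- to use. The lines below illustrate this on Stata's invest2 example dataset, which is an assumption standing in for the data in this thread (and -webuse- needs internet access).

```stata
* Illustration on Stata's example panel (invest2), NOT the data from this thread
webuse invest2, clear
xtset company time

* Two-step FGLS: no log likelihood is stored, so e(ll) displays as missing (.)
xtgls invest market stock, panels(heteroskedastic)
display e(ll)

* Iterated GLS: converges to the MLE for this model, so e(ll) is available
xtgls invest market stock, panels(heteroskedastic) igls
display e(ll)
```

If this reading is right, the likelihood-ratio comparison needs the igls fit, so the omitted-variable problem in the igls model has to be resolved rather than sidestepped.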

"Michael N. Mitchell"<Michael.Norman.Mitchell@gmail.com>  28/07/2010 2:03 am>>>
Dear Jing

Perhaps my last bit of advice was slightly misguided... maybe the model would benefit
from a rescaling of the outcome variable. I am just very concerned about those extremely
tiny standard errors, because computationally I wonder how close they are to 0. (And, thus
how many other quantities are getting close to 0). If that does not solve the problem of
your predictors being dropped in the first model, then that problem needs to be solved
first. My hunch is that it is an issue of multicollinearity, but I am unaware of how to
examine that in the context of a panel model. Maybe others have suggestions?
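
One rough way to screen for the collinearity mentioned above is to ignore the panel structure for a moment and look at variance inflation factors from a pooled regression, plus the pairwise correlations. This is only a screening sketch under that simplifying assumption, not a panel-specific diagnostic, and it uses Stata's invest2 example dataset as a stand-in for the data in this thread.

```stata
* Illustration on Stata's example panel (invest2), NOT the data from this thread
webuse invest2, clear

* Pooled OLS with the same regressors, used only to obtain VIFs
regress invest market stock
estat vif

* Pairwise correlations among the regressors as a second, cruder check
correlate market stock
```

Large VIFs or near-perfect correlations among the regressors would be consistent with terms being dropped from the iterated model, although they would not prove it.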

Best luck!

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-27 1.30 AM, Jing Zhou wrote:
Dear Michael,

Thank you for your suggestions. In fact, the predictor in my model is a percentage, which should not be very large. However, I still followed your suggestion and divided the other large regressors. The results are unchanged (p-value of 1.000), and in the first model some regressors are still omitted. I am wondering how these variables came to be omitted?

Thanks.

Jing

"Michael N. Mitchell"<Michael.Norman.Mitchell@gmail.com>   27/07/2010 6:02 pm>>>
Dear Jing

I think this is very informative. I notice two issues...

1) The terms -aci2-, -leverage-, and the constant (_cons) were omitted from the first model.

2) The standard errors are extremely tiny, and the coefficients for the terms that are
present are very small.

I wonder if you have an issue with the scaling of the variables, and that your model is
not being estimated very stably because the units of the variables are very large. You
might try dividing the predictor by a constant, e.g.

. generate tlawnew = tlaw / 1000

and then entering the "new" variable. I think this might lead to a more stable estimate
of the first model, and then different results with respect to the chi-squared test.

Best regards,

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-27 12.51 AM, Jing Zhou wrote:
The following is the command and the corresponding output.

. xtgls roa tlaw genvironment aci2 size leverage age, igls panels (heteroskedastic)
Iteration 1: tolerance = .01281716
Iteration 2: tolerance = .01676558
Iteration 3: tolerance = .25025852
Iteration 4: tolerance = .00706137
Iteration 5: tolerance = .04061494
Iteration 6: tolerance = .03815978
Iteration 7: tolerance = .03675714
Iteration 8: tolerance = .02342555
Iteration 9: tolerance = .00073142
Iteration 10: tolerance = .00832932
Iteration 11: tolerance = 3.144e-06
Iteration 12: tolerance = 1.718e-07
Iteration 13: tolerance = .1305574
Iteration 14: tolerance = .11548056
Iteration 15: tolerance = .08959096
Iteration 16: tolerance = .02050352
Iteration 17: tolerance = .006188
Iteration 18: tolerance = .02034936
Iteration 19: tolerance = .01040934
Iteration 20: tolerance = .0073191
Iteration 21: tolerance = .00270878
Iteration 22: tolerance = .00243333
Iteration 23: tolerance = .00237504
Iteration 24: tolerance = .14171418
Iteration 25: tolerance = .00958554
Iteration 26: tolerance = .00850144
Iteration 27: tolerance = .00094421
Iteration 28: tolerance = .02799819
Iteration 29: tolerance = 8.475e-06
Iteration 30: tolerance = .00224329
Iteration 31: tolerance = .11496823
Iteration 32: tolerance = .0108985
Iteration 33: tolerance = .00491695
Iteration 34: tolerance = .01146044
Iteration 35: tolerance = .11495675
Iteration 36: tolerance = .00775622
Iteration 37: tolerance = .00769652
Iteration 38: tolerance = .00452005
Iteration 39: tolerance = .00376106
Iteration 40: tolerance = .00165737
Iteration 41: tolerance = .00165462
Iteration 42: tolerance = .00148306
Iteration 43: tolerance = .00311958
Iteration 44: tolerance = .00028596
Iteration 45: tolerance = .00036032
Iteration 46: tolerance = .00211196
Iteration 47: tolerance = .0600343
Iteration 48: tolerance = .0023866
Iteration 49: tolerance = .01014685
Iteration 50: tolerance = .06387619
Iteration 51: tolerance = .07202545
Iteration 52: tolerance = .02556249
Iteration 53: tolerance = .00008123
Iteration 54: tolerance = .00004186
Iteration 55: tolerance = .00175812
Iteration 56: tolerance = .05552171
Iteration 57: tolerance = .01552817
Iteration 58: tolerance = .01716332
Iteration 59: tolerance = .02063742
Iteration 60: tolerance = .01274508
Iteration 61: tolerance = .00920043
Iteration 62: tolerance = .12077282
Iteration 63: tolerance = .00905253
Iteration 64: tolerance = .01079828
Iteration 65: tolerance = .03328352
Iteration 66: tolerance = .01233767
Iteration 67: tolerance = .00929827
Iteration 68: tolerance = .05281334
Iteration 69: tolerance = .03867031
Iteration 70: tolerance = .01011156
Iteration 71: tolerance = .00011164
Iteration 72: tolerance = .00999907
Iteration 73: tolerance = 7.644e-08

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic
Correlation:   no autocorrelation

Estimated covariances      =       621          Number of obs      =      2916
Estimated autocorrelations =         0          Number of groups   =       621
Estimated coefficients     =         3          Obs per group: min =         1
                                                               avg =  4.695652
                                                               max =        10
                                                Wald chi2(3)       =  4.40e+13
Log likelihood             =   4073.23          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         roa |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        tlaw |    .000556   8.73e-09  6.4e+04   0.000      .000556     .000556
genvironment |   .0013927   4.93e-08  2.8e+04   0.000     .0013926    .0013928
        aci2 |  (omitted)
        size |   .0003605   5.94e-08  6065.32   0.000     .0003604    .0003606
    leverage |  (omitted)
         age |  -.0030722   6.90e-09 -4.5e+05   0.000    -.0030722   -.0030722
       _cons |  (omitted)
------------------------------------------------------------------------------

. estimates store hetero

. xtgls roa tlaw genvironment aci2 size leverage age

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        homoskedastic
Correlation:   no autocorrelation

Estimated covariances      =         1          Number of obs      =      2916
Estimated autocorrelations =         0          Number of groups   =       621
Estimated coefficients     =         7          Obs per group: min =         1
                                                               avg =  4.695652
                                                               max =        10
                                                Wald chi2(6)       =    372.93
Log likelihood             =  855.1189          Prob > chi2        =    0.0000

------------------------------------------------------------------------------
         roa |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        tlaw |   .0010038   .0001621     6.19   0.000      .000686    .0013216
genvironment |   .0015588   .0021767     0.72   0.474    -.0027075    .0058251
        aci2 |   .0244187    .006892     3.54   0.000     .0109106    .0379268
        size |   .0202311   .0034783     5.82   0.000     .0134137    .0270485
    leverage |  -.0176257   .0013651   -12.91   0.000    -.0203012   -.0149502
         age |  -.0039722   .0007833    -5.07   0.000    -.0055075    -.002437
       _cons |  -.4569186   .0729608    -6.26   0.000    -.5999192   -.3139181
------------------------------------------------------------------------------

. local df=e(N_g)-1

. display e(N_g)-1
620

.
end of do-file

. lrtest hetero ., df(620)

Likelihood-ratio test                                 LR chi2(620)=  -6436.22
(Assumption: hetero nested in .)                      Prob > chi2 =    1.0000
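
The negative statistic can be reproduced by hand from the two log likelihoods reported above. As the "Assumption" line says, -lrtest- treated hetero as the restricted model, presumably because the igls fit kept only 3 coefficients against 7 in the homoskedastic fit, and so it formed 2*(ll of "." minus ll of hetero). A negative chi-squared value then yields a p-value of 1. The sketch below only shows where the sign comes from; because the igls model dropped three terms, neither number should be read as a valid test.

```stata
* Log likelihoods copied from the two xtgls outputs above
scalar ll_hetero = 4073.23      // igls, panels(heteroskedastic)
scalar ll_homo   = 855.1189     // homoskedastic

* What lrtest computed under "hetero nested in ." (matches -6436.22 up to rounding)
display 2*(ll_homo - ll_hetero)

* With the heteroskedastic model treated as the unrestricted one, the sign flips
scalar lr = 2*(ll_hetero - ll_homo)
display lr

* Upper-tail chi-squared probability with 620 degrees of freedom
display chi2tail(620, lr)
```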

Thank you.

Jing

"Michael N. Mitchell"<Michael.Norman.Mitchell@gmail.com>    27/07/2010 5:37 pm>>>
Dear Jing

Based on reading the FAQ (at http://www.stata.com/support/faqs/stat/panel.html) and the
results you report, it sounds like your data do not show heteroskedasticity across panels.
But, at the same time, I share your concern about getting a p value of 1.000. Perhaps you
could post your commands and output (suppressing any output that you need to suppress for
privacy/confidentiality) so we might be able to see any clues of trouble.

Best regards,

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-26 11.40 PM, Jing Zhou wrote:
Dear Michael,

Thank you for your kind assistance. Following the commands recommended in the FAQ and your suggestion, I ran this test in Stata. The result, however, is a little weird: the df is large (620), and Prob > chi2 = 1.0000. Can I conclude from this result that my panel data do not exhibit heteroskedasticity? Or is there still some problem in the process? Thanks!

Jing

"Michael N. Mitchell"<Michael.Norman.Mitchell@gmail.com>     27/07/2010 3:24 pm>>>
Dear Jing

Based on your example, it looks like you could do this...

. xtgls..., igls panels (heteroskedastic)
. estimates store hetero
. xtgls...
. display e(N_g)-1

The last command will show, I believe, the number of groups minus 1. It looks like your
example uses this for the degrees of freedom. Say that number was 157. You could then type

. lrtest hetero ., df (157)

and it looks like it would use 157 as the df. I am out of my element here, so I trust
that someone else will correct me if I am off base. But I hope this helps.
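
The same steps can be run without retyping the number by storing the degrees of freedom in a local macro. Two details are easy to get wrong: the macro is referenced with a left single quote (backtick) and an apostrophe, as in `df', and a local defined in a do-file disappears when the do-file ends, so the -lrtest- line has to run in the same do-file. The sketch below uses Stata's invest2 example dataset as an assumption, since the variables in this thread are not available.

```stata
* Illustration on Stata's example panel (invest2), NOT the data from this thread
webuse invest2, clear
xtset company time

* Unrestricted model: panel-specific variances, iterated so that e(ll) exists
xtgls invest market stock, igls panels(heteroskedastic)
estimates store hetero

* Restricted model: one common variance
xtgls invest market stock

* Degrees of freedom = number of panels minus 1, kept in a local macro
local df = e(N_g) - 1
lrtest hetero ., df(`df')
```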

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-26 9.58 PM, Jing Zhou wrote:
Thank you, Michael. For the command "lrtest hetero ., df(`df')", how can I get the value of df?

Jing

"Michael N. Mitchell"<Michael.Norman.Mitchell@gmail.com>      27/07/2010 2:23 pm>>>
Greetings

I wonder if this would help...

. set matsize 800

(or select another number in place of 800).

Hope that helps,

Michael N. Mitchell
Data Management Using Stata      - http://www.stata.com/bookstore/dmus.html
A Visual Guide to Stata Graphics - http://www.stata.com/bookstore/vgsg.html
Stata tidbit of the week         - http://www.MichaelNormanMitchell.com

On 2010-07-26 7.57 PM, Jing Zhou wrote:
Dear All,

I want to test for heteroskedasticity in my panel data, using the commands recommended in the FAQ:

xtgls ..., igls panels(heteroskedastic)
estimates store hetero
xtgls ...
local df = e(N_g) - 1
lrtest hetero ., df(`df')

The result is the error message "matsize too small - should be at least 621". Could you please advise me on the likely cause of this problem and how I can fix it?

Many thanks!

Jing

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
*