

From: Maarten Buis <maartenlbuis@gmail.com>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Jarque-Bera test
Date: Thu, 27 Sep 2012 15:00:57 +0200

On Thu, Sep 27, 2012 at 3:44 AM, Nick Cox wrote:
> The essence of the matter is that Jarque-Bera uses asymptotic results
> regardless of sample size for a problem in which convergence to those
> results is very slow. This approach is decades out of date and I am
> surprised that StataCorp support the test without a warning. The
> Doornik-Hansen test, for example, looks much more satisfactory.

I took up this challenge and ran a simulation comparing the performance of the Jarque-Bera test with that of the Doornik-Hansen test. In particular, I focused on whether the p-values follow a uniform distribution, i.e. whether the nominal rejection rates correspond to the proportion of simulations in which the null hypothesis was rejected at those nominal rates.

In essence, both tests perform badly at sample sizes of 100 and 1,000. As Nick suggested, the Jarque-Bera test performs even worse than the Doornik-Hansen test, but my conclusion would be that 1,000 observations is just not enough for either test. At 10,000 and 100,000 observations both tests seem to perform acceptably. However, at such large sample sizes you need to worry about whether a rejection of the null hypothesis actually represents a substantively meaningful deviation from the normal/Gaussian distribution.

So the bottom line is: at small sample sizes graphs are the only reliable way of judging whether a variable comes from a normal/Gaussian distribution, because the tests just don't perform well enough. At large sample sizes graphs are still the only reliable way of judging this, because in large samples the tests will pick up substantively meaningless deviations from the null hypothesis.
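For reference, here is a sketch of what the simulation below computes by hand for the Jarque-Bera test, and of the calibration property it checks. The symbols are shorthand introduced here, not part of the original code: n stands for r(N), S for r(skewness) and K for r(kurtosis) as returned by -summarize, detail-, and alpha for a nominal rejection rate.

```latex
% Jarque-Bera statistic and its asymptotic p-value (chi-squared, 2 df),
% matching the -scalar- and -chi2tail(2, .)- lines in the simulation
\[
  \mathrm{JB} = \frac{n}{6}\left( S^{2} + \frac{(K - 3)^{2}}{4} \right),
  \qquad
  p_{\mathrm{JB}} = \Pr\left( \chi^{2}_{2} > \mathrm{JB} \right)
\]

% If a test is well calibrated, its p-values are uniform under the null,
% so the rejection rate at level alpha equals alpha
\[
  \Pr\left( p \le \alpha \mid H_{0} \right) = \alpha
  \quad \text{for all } \alpha \in [0, 1]
\]
```

The second expression is the benchmark against which the simulated p-values are compared: deviations from it at a given sample size mean that the nominal and actual rejection rates differ.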
*------------------- begin simulation -------------------
clear all

program define sim, rclass
    // draw 100,000 standard normal values; the tests use nested subsamples
    drop _all
    set obs `=1e5'
    gen x = rnormal()

    tempname jb jbp
    // i = 2/5 corresponds to N = 100, 1,000, 10,000 and 100,000
    forvalues i = 2/5 {
        // Jarque-Bera statistic and its asymptotic chi-squared(2) p-value
        sum x in 1/`=1e`i'', detail
        scalar `jb' = (r(N)/6) * ///
            (r(skewness)^2 + 1/4*(r(kurtosis) - 3)^2)
        scalar `jbp' = chi2tail(2,`jb')
        return scalar jb`i' = `jb'
        return scalar jbp`i' = `jbp'

        // Doornik-Hansen statistic and p-value
        mvtest norm x in 1/`=1e`i''
        return scalar dh`i' = r(chi2_dh)
        return scalar dhp`i' = r(p_dh)
    }
end

simulate jb2=r(jb2) jbp2=r(jbp2) ///
         jb3=r(jb3) jbp3=r(jbp3) ///
         jb4=r(jb4) jbp4=r(jbp4) ///
         jb5=r(jb5) jbp5=r(jbp5) ///
         dh2=r(dh2) dhp2=r(dhp2) ///
         dh3=r(dh3) dhp3=r(dhp3) ///
         dh4=r(dh4) dhp4=r(dhp4) ///
         dh5=r(dh5) dhp5=r(dhp5) ///
         , reps(2e4): sim

// rename the p-values so they can be reshaped to long form by test
rename jbp2 p2jb
rename jbp3 p3jb
rename jbp4 p4jb
rename jbp5 p5jb
rename dhp2 p2dh
rename dhp3 p3dh
rename dhp4 p4dh
rename dhp5 p5dh

gen id = _n
reshape long p2 p3 p4 p5, i(id) j(dist) string

label var p2 "N=100"
label var p3 "N=1,000"
label var p4 "N=10,000"
label var p5 "N=100,000"

// -encode- sorts alphabetically: "dh" becomes 1, "jb" becomes 2
encode dist, gen(distr)
label define distr 2 "Jarque-Bera" ///
                   1 "Doornik-Hansen", replace
label value distr distr

simpplot p?, by(distr) scheme(s2color) legend(cols(4))
*-------------------- end simulation --------------------
(For more on examples I sent to the Statalist see:
http://www.maartenbuis.nl/example_faq )

This simulation requires the -simpplot- package, available at SSC and described here: <http://www.maartenbuis.nl/software/simpplot.html>

-- Maarten

---------------------------------
Maarten L. Buis
WZB
Reichpietschufer 50
10785 Berlin
Germany

http://www.maartenbuis.nl
---------------------------------

