# st: RE: chitesti -- warning -- expected

From: "Nick Cox" | Subject: st: RE: chitesti -- warning -- expected | Date: Wed, 10 Mar 2004 23:02:02 -0000

```
Good question.

-chitesti- and its sibling -chitest- are in a package
-tab_chi- on SSC. The latest public versions of -chitesti-
and -chitest- are 2.0.0, both from July 2003.

The immediate command -chitesti- in fact calls -chitest-
(with a secret handshake indicating keyboard input).

What happens internally is that the observed and
expected frequencies are put in -float- variables.
A -float- cannot hold all the digits you want to hold.
I make the expected frequencies

406694.3598 and 29766.6402

which add to 436461 exactly by virtue of 0.9318 +
0.0682 equalling 1. However, putting them in a -float-
stores them as 406694.375 and 29766.640625, and getting
the total then yields 436461.015625. Of course everything
is done in binary and we are just seeing the decimal
representation here.

Here is that difference in hexadecimal:

. di %21x 436461.015625
+1.aa3b410000000X+012

. di %21x 436461
+1.aa3b400000000X+012

So near, and yet so far!

Now -chitest- squawks if the sum of observed and
the sum of expected differ by more than 0.01 and the
difference here of 0.015625 qualifies.

The absolute difference criterion of 0.01 was just
plucked out of the air when -chitest- was first
written several years ago. For numbers as big as yours
a relative difference criterion would presumably
make more sense.

Why then is -chitest- telling you both that
numbers are the same and that they are
different? "Same" comes from the display
statement, here equivalent to

. di %8.0g 436461.015625
436461

That format in turn was chosen so that integers
show as integers as far as possible,
without irritating extra ".00000" or whatever.
Here, however, the format hides the small details.

"Different" comes from the numbers held
in memory, which differ by 0.015625.

I just rewrote -chitest- and -chitesti- to use
doubles throughout. The results are better:

. chitesti 314795 121666 \ 0.9318*436461 0.0682*436461

observed frequencies from keyboard; expected frequencies from keyboard

Pearson chi2(1) =  3.0e+05   Pr =  0.000
likelihood-ratio chi2(1) =  1.8e+05   Pr =  0.000

+-----------------------------------------------+
| observed     expected    obs - exp    Pearson |
|-----------------------------------------------|
|   314795   406694.360   -91899.360   -144.105 |
|   121666    29766.640    91899.360    532.657 |
+-----------------------------------------------+

In short, this is a salutary lesson in precision. The
program author should perhaps read e.g.

http://www.stata.com/support/faqs/data/mod.html

The defence, if there is one, is that the author
grew up in a small house in a small country and still
thinks that using -double- where -float- apparently
will do fine is profligate use of space.

Incidentally, the chi-square test shows a P-value
indistinguishable from 0.

Nick
n.j.cox@durham.ac.uk

> -----Original Message-----
> From: owner-statalist@hsphsun2.harvard.edu
> [mailto:owner-statalist@hsphsun2.harvard.edu]On Behalf Of
> Benoit Dulong
> Sent: 10 March 2004 21:53
> To: statalist
> Subject: st: chitesti -- warning -- expected
>
>
> The command
> chitesti 314795 121666 \ 0.9318*436461 0.0682*436461
> produced
>
> Chi-square test:
> observed frequencies from keyboard
> expected frequencies from keyboard
>
> Warning: totals of observed and expected differ
>               total
> observed     436461
> expected     436461
>
>          Pearson chi2(1) = 304489.6035   Pr =  0.000
> likelihood-ratio chi2(1) = 181321.5938   Pr =  0.000
>
>                                    residuals
>       observed    expected     classic     Pearson
>   1.    314795  406694.375  -91899.375    -144.105
>   2.    121666   29766.641   91899.359     532.657
>
> ------------------------------------------------------
>
> QUESTION-1.
> I do not understand the warning because
> observed and expected do NOT differ ?
>
> QUESTION-2
> Expected (1) should be 436461*0.9318 = 406694.3598,
> not 406694.375 ?

```
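
The rounding described in the reply can be reproduced outside Stata. The following is a minimal Python sketch, not the code of -chitest- itself. The helper `to_float32` and the relative tolerance `REL_TOL` are assumptions made for illustration; the post only suggests that a relative criterion might make more sense and does not give a value. The sketch rounds the two expected frequencies to single precision, sums them, and applies both the absolute criterion of 0.01 and a hypothetical relative one.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (IEEE binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


total = 436461
# Expected frequencies as typed in the chitesti call, evaluated in double precision.
expected = [0.9318 * total, 0.0682 * total]

# Storing them at float (binary32) precision changes the trailing digits.
stored = [to_float32(e) for e in expected]
sum_stored = sum(stored)  # summed in double precision, so this sum is exact
abs_diff = abs(sum_stored - total)

# Absolute criterion described in the post: warn if totals differ by more than 0.01.
warns_absolute = abs_diff > 0.01

# Hypothetical relative criterion; the 1e-6 tolerance is an assumption for illustration.
REL_TOL = 1e-6
warns_relative = abs_diff / total > REL_TOL

print(stored)            # [406694.375, 29766.640625]
print(sum_stored)        # 436461.015625
print(sum_stored.hex())  # 0x1.aa3b410000000p+18
print(warns_absolute, warns_relative)  # True False
```

The hexadecimal output matches the `%21x` display in the reply, apart from notation: Stata writes the exponent in hexadecimal (`X+012`), Python in decimal (`p+18`). With the assumed tolerance, the relative check does not warn, because a discrepancy of 0.015625 on a total of 436461 is smaller than single-precision rounding error at that magnitude.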