Stata The Stata listserver

Re: st: Regarding ANOVA models


From   khigbee@stata.com
To   statalist@hsphsun2.harvard.edu, mehryar@hotmail.com
Subject   Re: st: Regarding ANOVA models
Date   Tue, 04 Feb 2003 14:01:35 -0600

Ali Karim <mehryar@hotmail.com> asked about a difference in the
SS for the term "round" when running

    . anova close sex round round*sex formno|sex round*formno|sex

(which produced an SS of .946258154 for the round term)
versus dropping the last term so that it becomes part of the
residual

    . anova close sex round round*sex formno|sex

(which produced an SS of .934108527), with the SS for all other
terms remaining the same.

I then asked for the data, and Ali has sent it to me -- thanks.
I have been looking at the data.  My first suspicion was that I
would find some typo in the "formno" variable that would reveal
that the data were not actually nested as claimed.  That was not
the case.  "formno" looked fine.  I looked at the various 2-way
tables and the 3-way table and nothing strange appeared.

I checked this first because, in the few cases where a question
like this has come our way, the problem has been that the data do
not match what the model specifies (for instance, an id number
misentered so that the same id number appears for both a male and
a female subject, so the data are not actually nested as claimed
by the model).  The resulting differences in the ANOVAs can then
be explained in terms of the data anomaly.
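
A direct way to test the nesting is to assert that "sex" is
constant within each "formno".  This is only a sketch, assuming
"formno" is the subject id:

    . bysort formno (sex): assert sex[1] == sex[_N]

Within each "formno" the observations are sorted by "sex", so the
first and last values agree only when every observation for that
id has the same sex.  If an id had been entered for both a male
and a female subject, the -assert- would stop with an error.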

I then started looking at the dependent variable "close".  This
variable is binary (0/1) rather than continuous.  Some argue
that you should not use ANOVA on binary data, or at least that
you cannot really believe the test statistics and resulting
p-values, since the assumption of normality is violated.  This
alone, however, does not explain the difference in output from
the 2 models Ali ran.

What I believe is at the heart of the difference is the lack of
variation in the data.  Of the 129 "formno"s, only 32 show any
change in the binary dependent variable "close" over the four
"round"s.  So about three-quarters of the "formno"s (97 of 129)
have no variability (all zeros or all ones across the 4 rounds).
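
The count of "formno"s with and without variation can be checked
with something like the following sketch, in which "varies" and
"tagged" are variable names made up for illustration and "close"
is assumed to have no missing values:

    . bysort formno (close): gen byte varies = close[1] != close[_N]
    . egen byte tagged = tag(formno)
    . tabulate varies if tagged

Within each "formno" the observations are sorted by "close", so
the first and last values differ only when both a 0 and a 1
occur.  -tag()- then marks one observation per "formno" so that
each "formno" is counted once in the tabulation.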

One helpful table is obtained with

    . sort sex formno round
    . by sex: table formno round close

I think what is happening is analogous to when you run a -logit-
(or many of the other binary outcome models) and you get messages
such as

    note:  xyz != 0 predicts success perfectly
           xyz dropped and 4 obs not used

and

    note:  4 failures and 0 successes completely determined


As an example, I looked at

    . xi : logit close i.sex*i.round i.formno

It dropped a lot of the observations (with notes like those I
show above) since so many did not have any variation.  But it
still should have dropped one more of the "formno" dummy
variables.  If Ali runs this logit on the data, Ali will see one
row of the logit table that has a missing standard error.  See

    http://www.stata.com/support/faqs/stat/logitcd.html

for why and for guidance.  You need to drop the dummy variable
that gets the missing standard error and then run the logit over.
In my case I said

    . drop _Iformno_320
    . logit close _I*

followed by

    . testparm _Isex_*
    . testparm _Iround_*
    . testparm _IsexXrou*
    . testparm _Iform*

to examine some tests that might be of interest.

Ali may want to consider the various binary outcome models that
are available in Stata.  There are alternatives to using -logit-
and -xi-.  Ali might want to look at -desmat- (a user-written
command) as an alternative to -xi- for creating dummy variables,
and might also want to explore the -xt- series of commands (such
as -xtlogit-) that deal with cross-sectional time-series (panel)
data.
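
As a rough sketch of the -xtlogit- route (the random-effects
specification with "formno" as the panel variable is an
assumption made for illustration, not a model that has been fit
to these data):

    . xtset formno
    . xi: xtlogit close i.sex*i.round, re

-xtset- is available in later releases of Stata; in releases
without it, the panel variable can instead be named with the i()
option of -xtlogit-.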

In terms of seeing what trend might be happening in the data, I
learned a lot from looking at

    . table round close sex

    ------------------------------------
              | q101. record sex of the 
              |   respondent and close  
              | - female -    -- male --
        round |    0     1       0     1
    ----------+-------------------------
            0 |   13    30      44    42
            1 |   11    32      36    50
            2 |   11    32      29    57
            3 |   13    30      25    61
    ------------------------------------

The ANOVAs that Ali ran, the -logit-s that I ran, and this table
seem to indicate that there is an interaction between round and
sex.  It looks like the females remain stable over the four
rounds with about 70 to 75% of them having a 1 for "close", while
the males start out at about 50% and grow to about 70% of 1s for
"close" over the rounds.
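
These percentages can be read off the table above: for females,
30/43 = .698 in rounds 0 and 3 and 32/43 = .744 in rounds 1 and
2; for males, 42/86 = .488 in round 0, rising to 61/86 = .709 in
round 3.  Because "close" is 0/1, its mean is the proportion of
1s, so a sketch such as the following (using the contents()
option of -table-) displays the proportions directly:

    . table round sex, contents(mean close) format(%5.3f)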


Ken Higbee    khigbee@stata.com
StataCorp     1-800-STATAPC



