RE: st: gologit2

From: Maarten Buis
To: statalist@hsphsun2.harvard.edu
Subject: RE: st: gologit2
Date: Wed, 16 Apr 2008 10:01:25 +0100 (BST)

```
--- Richard Williams wrote:
> >>>You might also want to consider more
> stringent alpha levels (e.g. .01, .001) to reduce the possibility of
> capitalizing on chance.  You can also try to assess the practical
> significance of violations, e.g. do my conclusions and/or predicted
> probabilities really change that much if I stick with the model whose
> assumptions are violated as opposed to a (possibly much harder to
> understand and interpret) model whose assumptions are not
> violated.<<<

--- "Verkuilen, Jay" wrote:
> Right, this is about what I was thinking---be more stringent about
> the test. I wonder if anyone's done good simulation studies to see the
> properties of the Brant test?

I don't know of any such simulation in the literature, but you can use
-simulate- to do it yourself. In the example below I draw at random
from a population in which the proportional odds assumption holds, and
then use the Brant test to test that assumption. For this I use the
-brant- command, which is part of the -spost- package (see:
-findit spost-). I then repeat this 10,000 times. If the Brant test
works, then at the 5% level it should reject the (true) null hypothesis
in about 5% of the draws.

Because Rich and Jay (and I) think that a potential problem with the
Brant test is that it tests many things at once (the odds are
proportional for all variables and for all equations), I repeat this
experiment for 1, 2, .., 10, 12, 14, .., 20 explanatory variables. If
our hunch is correct, the Brant test should do worse (reject the true
null hypothesis in more than 5% of the samples) in models with more
explanatory variables.

Because running this simulation takes quite a while, I will give the
results below:

# of Xs | % reject H0
----------------------
      1 |  5.18
      2 |  5.55
      3 |  5.19
      4 |  5.49
      5 |  5.61
      6 |  5.39
      7 |  5.73
      8 |  5.88
      9 |  6.09
     10 |  6.33
     12 |  7.06
     14 |  7.58
     16 |  8.67
     18 |  9.75
     20 | 11.85
----------------------

So, in this simulation the Brant test seems to perform reasonably well
up to 10 explanatory variables, but then starts to deviate noticeably
from the nominal 5%. Do not read too much into this number of covariates:
the data were simulated to be very well behaved; in real data that
number may be much smaller. Moreover, this number is likely to depend
on the number of categories in your dependent variable as well (in the
simulation there are four categories).
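
A rough yardstick for reading the table (a sketch added for
illustration, using a normal approximation; the local macro names are
made up): with 10,000 replications and a true rejection rate of 5%, the
Monte Carlo standard error of an estimated rejection rate is about 0.22
percentage points, so estimates between roughly 4.57% and 5.43% are
compatible with simulation noise alone. The rates for 9 or more
explanatory variables (6.09% and up) lie far outside that band.

*--------------------- begin example --------------------
* Monte Carlo standard error of an estimated rejection rate,
* in percentage points, if the true rate is 5%
local reps = 10000
local se = 100*sqrt(.05*.95/`reps')
local lb = 5 - 1.96*`se'
local ub = 5 + 1.96*`se'
display "MC s.e.  : " %5.3f `se'
display "95% band : " %4.2f `lb' " to " %4.2f `ub'
*---------------------- end example ---------------------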

The results suggest that the Brant test is a bit problematic in
larger models, but even if that weren't the case you should not
automatically avoid -ologit- whenever the Brant test rejects the
proportional odds assumption. The reason is that rejecting the null
hypothesis may not be that informative. We (almost) never believe that
a null hypothesis is exactly true. This is especially true in the case
of a test of a model, like the Brant test, because it is the very
purpose of a model to be wrong (in a special way). This may be a bit
provocative, so let me elaborate: a model exists to simplify your
observations, such that you can relate them to your theory. You need to
simplify because the patterns in the raw data are too complicated to be
understood by just looking at the data. Simplifying is just a special
case of being wrong. So, the purpose of a model is to be wrong in a
special way. So, if the Brant test rejects the proportionality
assumption, it is up to you to determine whether the proportionality
assumption is still acceptable as a simplification or not.

Below is the code I used for this simulation, in case you want to
replicate my results, or want to expand on it, for instance by creating
a dependent variable with more categories, or with one or more sparse
categories (categories with few observations).

*--------------------- begin simulation -----------------
set more off
set seed 12345

capture program drop sim

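program define sim, rclass
    syntax, [nx(integer 1)]
    * start from an empty data set with 500 observations
    drop _all
    set obs 500
    * nx independent standard normal explanatory variables
    forvalues i = 1/`nx' {
        gen x`i' = invnorm(uniform())
        local x `x' x`i'
    }
    local x : list retokenize x
    * linear predictor: all coefficients equal 1
    local xsum : subinstr local x " " " + ", all
    * standard logistic error, so proportional odds holds by construction
    gen u = uniform()
    gen ystar = `xsum' + ln(u/(1-u))
    * cut the latent variable at -2, 0 and 2 into four categories
    gen y = cond(ystar < -2, 1,     ///
            cond(ystar <  0, 2,     ///
            cond(ystar <  2, 3, 4)))
    ologit y `x'
    * overall Brant test; its p-value is left in r(p)
    brant
    return scalar p = r(p)
end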

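* 10,000 replications with one explanatory variable
simulate p=r(p), reps(10000): sim, nx(1)
count if p < .05
* store the number of Xs and the proportion of rejections
* (multiply by 100 to get the percentages reported above)
matrix res = 1, r(N)/10000

* same for 2, 3, .., 10, 12, 14, .., 20 explanatory variables
foreach i of numlist 2/10 12(2)20 {
    simulate p=r(p), reps(10000): sim, nx(`i')
    count if p < .05
    matrix res = res \ `i', r(N)/10000
}
matlist res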
*-------------------- end simulation ----------------------
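
One way to try more categories or a sparse category, as suggested
above, is to make the cut points an option. The following is an
untested sketch under stated assumptions: -simcut- and its cuts()
option are hypothetical names, not part of any package; the data
generating process is copied from -sim- above; and 1,000 replications
with 5 explanatory variables are arbitrary choices to keep the run
short. The irecode() function cuts ystar at the listed points, so
cuts(-2 0 2) reproduces the four categories used above, while adding a
high cut point such as 4.5 creates a fifth, sparse category.

*--------------------- begin example --------------------
capture program drop simcut

* hypothetical variant of -sim- with user-supplied cut points
program define simcut, rclass
    syntax [, nx(integer 1) cuts(numlist ascending)]
    * default: the same cut points as in -sim-
    if "`cuts'" == "" local cuts -2 0 2
    local cuts : list retokenize cuts
    local cutlist : subinstr local cuts " " ", ", all
    drop _all
    set obs 500
    forvalues i = 1/`nx' {
        gen x`i' = invnorm(uniform())
        local x `x' x`i'
    }
    local x : list retokenize x
    local xsum : subinstr local x " " " + ", all
    gen u = uniform()
    gen ystar = `xsum' + ln(u/(1-u))
    * irecode() returns 0, 1, 2, .. so add 1 to start counting at 1
    gen y = 1 + irecode(ystar, `cutlist')
    ologit y `x'
    brant
    return scalar p = r(p)
end

* five categories, the highest of which is sparse
simulate p=r(p), reps(1000): simcut, nx(5) cuts(-2 0 2 4.5)
count if p < .05
display r(N)/1000
*---------------------- end example ---------------------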
(For more on how to use examples/simulations I sent to the
Statalist, see http://home.fsw.vu.nl/m.buis/stata/exampleFAQ.html )

Hope this helps,
Maarten

-----------------------------------------
Maarten L. Buis
Department of Social Research Methodology
Vrije Universiteit Amsterdam
Boelelaan 1081
1081 HV Amsterdam
The Netherlands

Buitenveldertselaan 3 (Metropolitan), room Z434

+31 20 5986715

http://home.fsw.vu.nl/m.buis/
-----------------------------------------

```