
# st: RE: RE: two sample test under generalized Behrens-Fisher conditions

From: Nick Cox | To: statalist@hsphsun2.harvard.edu | Subject: st: RE: RE: two sample test under generalized Behrens-Fisher conditions | Date: Tue, 14 Dec 2010 15:16:34 +0000

```
I see the problem. I couldn't (wouldn't) fit -glm- in an introductory course either.

In similar circumstances I usually assert that t tests work well even if the assumptions are not well satisfied. This is an idea that goes back at least to G.E.P. Box in Biometrika 1953:

Box, G.E.P. 1953. Non-normality and tests on variances. Biometrika 40: 318-35.

Nick
n.j.cox@durham.ac.uk

Airey, David C

I was looking for "stark cookbooky" solutions for a (too) short intro course that will not address GLM. But transformations they will be told about, and the last time I taught this course, your help file about transformations was required reading. Thanks for that citation. Looks like a good book.

Nick Cox

> In this kind of territory, I would always
>
> 1. Check out what is said in Rupert G. Miller, Beyond ANOVA. See the CRC Press reissue:
> <http://www.crcpress.com/utility_search/search_results.jsf?conversationId=250169>
>
> Your library may hold a copy of the Wiley original.
>
> 2. Be wary of the stark cookbooky alternative: data if normal, ranks otherwise. What happened to the idea of transformations or link functions? How do you decide when the data are approximately normal anyway?
>
> Here is an example of a different approach. In the auto data, -mpg- given -foreign- is neither normal nor homoscedastic. But these are secondary issues. Consider this set of results. In each, -family(normal)- is implied.
>
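> * Fit the same model under five link functions and show z for -foreign-
> foreach v in "power 1" "power 0.5" "log" "power -0.5" "power -1" {
> 	qui glm mpg foreign, link(`v')
> 	mat b = e(b)
> 	mat V = e(V)
> 	* z = coefficient on -foreign- divided by its standard error
> 	di "`v'"    "{col 20}" %3.2f   b[1,1] / sqrt(V[1,1])
> }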
>
> power 1            3.63
> power 0.5          3.70
> log                3.75
> power -0.5         -3.78
> power -1           -3.80
>
> The change of sign of what -glm- calls the z statistic is an expected side-effect of changing to inverse transformations. More importantly, z changes only very slowly and the collective set of results points to the idea that 1/mpg is a more appropriate scale than mpg on which to test for differences. This of course matches basic science.
>
> Generalized linear models are nearly 40 years old as a family. When are they going to receive the recognition they deserve?

Airey, David C

>> I was reading a little about what to do when you have both unequal variance and non-normality. Neither the equal variance t-test nor the Mann-Whitney U test is best when you want to interpret the difference in means or medians.
>>
>> I had found the Stata command -fprank-, but it turns out this robust ranks test doesn't escape a symmetry assumption to interpret the location difference.
>>
>> I found that some recommend using Welch's t-test on the ranked data (Zimmerman and Zumbo (1993) Rank transformations and the power of the Student's t test and the Welch t' test for non-normal populations with unequal variances. Canadian Journal of Experimental Psychology 47:3, 523-539).
>>
>> This appears to be an easy and satisfying solution to teach with: always use the unequal variances t-test, and use ranks if the data are also not normal.

```
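
For readers who want to try the recipe raised at the start of the thread (Welch's t test on ranks, after Zimmerman and Zumbo 1993), here is a minimal sketch in Stata. It is not taken from any of the posts. It assumes the auto data as a stand-in, as in the -glm- example, and the variable name rank_mpg is made up for illustration. Whether this is a good test to teach is exactly what the thread questions, so treat it as a way to see the mechanics and nothing more.

```stata
* Sketch only: Welch's t test on raw data and on pooled ranks.
* The dataset and the name rank_mpg are assumptions for illustration.
sysuse auto, clear

* Rank mpg across both groups together; ties get average ranks by default
egen rank_mpg = rank(mpg)

* Unequal-variances t test with Welch's degrees of freedom, raw scale
ttest mpg, by(foreign) unequal welch

* The same test applied to the ranks
ttest rank_mpg, by(foreign) unequal welch
```

Comparing the two sets of results alongside the -glm- z statistics quoted above gives a rough sense of how much the conclusion depends on the scale chosen.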