# Re: st: Interpretation of Two-sample t test with equal variances?

| Field | Value |
|---|---|
| From | Nick Cox |
| To | statalist@hsphsun2.harvard.edu |
| Subject | Re: st: Interpretation of Two-sample t test with equal variances? |
| Date | Wed, 20 Mar 2013 14:33:39 +0000 |

```
In much the same spirit as earlier suggestions:

The mean ages given were 28.8 and 29.4 (presumably years) for the two
classes. That sounds like a difference without clinical significance,
although I am no clinician, not a woman, and not even significant.

However, it is also likely that the means are hiding important details
in the distributions. For example, I would expect skewed distributions
for mothers' ages -- and the skewness I might guess to differ between
the two modes of delivery. General knowledge suggests a range from
well under 20 to over 50 years.

Although I have much faith that Student's t test works well even if
you lie to it, skewness sounds like an area for investigation. My gut
instinct is that turning the problem round to make it a logit
regression on age makes much more sense. I would use a fractional
polynomial or cubic spline in age and always plot some smooth summary
of one or other fraction (e.g. fraction C or fraction V) versus age.

Nick

On Wed, Mar 20, 2013 at 2:02 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Gwinyai,
>
> In your first message you posed the question of whether the mode of
> delivery depended on (or was related to) mother's age.  The logistic
> regression is an appropriate way to approach that question.  The
> output says that, in your data, the odds of a C/section increase with
> mother's age, but the rate of increase does not differ significantly
> from zero.  That is, these data give no evidence that the risk of a
> C/section is related to mother's age.
>
> You may want to do a little diagnostic checking, to make sure that the
> logit model is a satisfactory summary of your data.  You could split
> the age range into intervals (with a reasonable total sample size in
> each interval), and calculate the percentage of C/sections in each
> category.  Does either group of mothers contain any unusually low or
> unusually high ages?
>
> I hope this discussion is helpful.
>
> David Hoaglin
>
> On Wed, Mar 20, 2013 at 1:04 AM, Gwinyai Masukume
> <parturitions@gmail.com> wrote:
>> Thank you, Richard. Yes, I guess the t-test suggests the
>> counterintuitive, though it probably won’t change things much.
>> How can I reverse the situation?
>>
>> I ran a logistic regression for binary outcomes as you suggested:
>> Essentially no significance is shown?
>>
>> . logit mode_delivery age
>>
>> Iteration 0:   log likelihood = -159.58665
>> Iteration 1:   log likelihood = -159.34203
>> Iteration 2:   log likelihood = -159.34197
>> Iteration 3:   log likelihood = -159.34197
>>
>> Logistic regression                               Number of obs   =        250
>>                                                   LR chi2(1)      =       0.49
>>                                                   Prob > chi2     =     0.4842
>> Log likelihood = -159.34197                       Pseudo R2       =     0.0015
>>
>> -------------------------------------------------------------------------------
>> mode_delivery |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
>> --------------+----------------------------------------------------------------
>>           age |   .0155454   .0222368     0.70   0.485     -.028038    .0591288
>>         _cons |  -1.133737   .6630978    -1.71   0.087    -2.433385    .1659111
>> -------------------------------------------------------------------------------
>>

```
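The coefficient on age in the quoted output is on the log-odds scale, which can be hard to read directly. A minimal sketch of how one might re-express it as an odds ratio, assuming `mode_delivery` is coded 0 = vaginal, 1 = C/section (the thread does not state the coding):

```stata
* Assumption: mode_delivery is 0 = vaginal, 1 = C/section (not stated in the thread)
logit mode_delivery age

* Redisplay the same fit as odds ratios rather than log odds
logit, or

* Odds ratio per additional year of age: exp(.0155454), about 1.016
display exp(_b[age])

* Odds ratio and confidence interval for a 10-year difference in age
lincom 10*age, or
```

With the reported interval of -.028038 to .0591288, the per-year odds ratio runs from about 0.97 to about 1.06, so the data are compatible with a small decrease as well as a small increase in the odds with age.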
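David Hoaglin's diagnostic check (split the age range into intervals and compare the percentage of C/sections in each) could be sketched roughly as follows. The new variable name `agegrp` and the choice of five groups are illustrative assumptions, not anything given in the thread:

```stata
* Assumption: mode_delivery is 0/1; agegrp is a hypothetical new variable
* Cut age into 5 groups of roughly equal size (the number 5 is arbitrary)
xtile agegrp = age, nquantiles(5)

* Age range and number of mothers in each group
tabstat age, by(agegrp) statistics(min max n)

* Row percentages give the share of each delivery mode within each age group
tabulate agegrp mode_delivery, row

* Look for unusually low or unusually high ages in either delivery group
tabstat age, by(mode_delivery) statistics(n min p25 p50 p75 max)
```

With 250 observations, five groups leave about 50 mothers in each, which is a reasonable size for comparing percentages. Finer groupings would make those percentages noisier.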
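Nick Cox's suggestion is to model delivery mode with a flexible function of age and to plot a smooth summary of the fraction against age. One way that might look in Stata is below. This sketch uses a restricted cubic spline with four knots. The knot count, the bandwidth, and the new variable names `agesp*` and `p_spline` are assumptions chosen for illustration, and the fractional polynomial alternative is not shown:

```stata
* Assumption: mode_delivery is 0 = vaginal, 1 = C/section
* Smoothed fraction of C/sections versus age (lowess on the 0/1 outcome)
lowess mode_delivery age, bwidth(0.8)

* Restricted cubic spline basis in age: 4 knots create agesp1-agesp3
mkspline agesp = age, cubic nknots(4)
logit mode_delivery agesp*

* Joint test of all spline terms
testparm agesp*

* Predicted probability of a C/section, plotted against age
predict p_spline, pr
twoway line p_spline age, sort ///
    ytitle("Predicted fraction C/section") xtitle("Age (years)")
```

Comparing the lowess curve with the fitted spline curve gives a quick visual check on whether a straight line in age on the logit scale is hiding any curvature.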