
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at


RE: st: centred mean age

From   Thomas Norris <>
To   "" <>
Subject   RE: st: centred mean age
Date   Thu, 31 Jan 2013 10:30:31 +0000


Thank you for this.

Going back to your previous point that the instability may be due to overfitting with the cubic polynomial: when I was assessing model fit, I used AIC, BIC and the residuals for the model and random effects. The cubic polynomial with 2 random age terms (AIC=-27621.8, BIC=-27539, sd residual random effects: 0.0594447, sd overall model resid: 0.0440447) was marginally better than the quadratic with 2 random age terms (AIC=-27611.16, BIC=-27535.92, sd residual random effects: 0.0594943, sd overall model resid: 0.0441144).

I know the figures are all slightly better with cubic, but are they 'better enough' for you to have concluded that the cubic was better?
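
To put the quoted fit statistics side by side, the differences can be computed directly. This is only a minimal arithmetic sketch in Python using the AIC and BIC values given above; the dictionary and function names are illustrative assumptions. The cubic is lower by about 10.6 on AIC and about 3.1 on BIC. How large a difference counts as "better enough" is a judgement call that the sketch does not settle.

```python
# Fit statistics quoted in the message above.
cubic = {"aic": -27621.8, "bic": -27539.0}
quadratic = {"aic": -27611.16, "bic": -27535.92}


def differences(model_a, model_b):
    """Return model_a minus model_b for each criterion.

    A negative value means model_a has the lower (preferred) criterion.
    """
    return {key: model_a[key] - model_b[key] for key in model_a}


delta = differences(cubic, quadratic)
print({key: round(value, 2) for key, value in delta.items()})
```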

Many thanks,


-----Original Message-----
From: [] On Behalf Of Nick Cox
Sent: 31 January 2013 09:58
Subject: Re: st: centred mean age

As said, this is just a matter of presentation, although powering very small numbers does strain the calculations. But the same point stands.

Your problem sounds like one where I might use log10() rather than ln() for the mundane reason that it would make it a little easier to edit resulting graphs to show weights, not log weights. Even if you are using a log scale for fitting, the reporting should make reference to weight.

In that vein log10(weight in grams) = 3 + log10(weight in kilograms), so a change of units is still pertinent when using logarithms.
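
The unit-change identity above can be checked numerically. The following is a minimal Python sketch with made-up weights chosen only for illustration (they are not from the data discussed in this thread). It also checks that ln() and log10() differ only by the constant factor ln(10), so moving between the two is a rescaling rather than a different model.

```python
import math

# Hypothetical weights in kilograms (illustrative values only).
weights_kg = [0.5, 1.2, 2.8, 3.4]


def log10_grams(weight_kg):
    """log10 of a weight given in kilograms, after converting it to grams."""
    return math.log10(weight_kg * 1000.0)


for w in weights_kg:
    # Changing units from kilograms to grams adds exactly 3 on the log10 scale.
    assert math.isclose(log10_grams(w), 3.0 + math.log10(w))
    # Natural and base-10 logarithms differ only by the factor ln(10).
    assert math.isclose(math.log(w), math.log(10.0) * math.log10(w))
```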

It might be even better to use a generalised linear model with log link.


On Thu, Jan 31, 2013 at 9:44 AM, Thomas Norris <> wrote:

> Weight is actually on the log scale ( ln(weight) ), not kilograms, as it showed increasing variability with age. This wouldn't have an effect, would it?

Nick Cox

> It is difficult to give really good advice without being able to look at the data, but it seems unusual to me that a cubic polynomial outperforms competitors. Independently of your main issues, I'd advise a change of units of measurement, if only to ease presentation (e.g. kilograms to grams).
> Very generally, instability of coefficients often signals possible over-fitting.

On Thu, Jan 31, 2013 at 9:13 AM, Thomas Norris <> wrote:

>> Thank you very much for your advice. If I may clarify, just so I can progress without doubt: I found that the best-fitting multilevel model for my prenatal weight dataset was a cubic polynomial (I also tried fractional polynomials and splines). I then decided to centre the age term, as it is not intuitive to have an intercept at age 0 when, in prenatal life, there should be nothing at zero.
>> I have created a dummy variable for ethnicity, to see if there are differences between two ethnic groups, and interacted this with age (pakage, pakage2, pakage3) and with centred age (in the centred model).
>> The coefficients in the uncentred model were:
>> Age: 0.256372
>> Age2: -0.0009669
>> Age3= -0.0000291
>> Pak= -0.5843112
>> Pakage= 0.0617149
>> Pakage2= -0.0021505
>> Pakage3= 0.0000234
>> In the centred model:
>> Age: 0.1062287
>> Age2: -0.0037464
>> Age3= -0.0000291
>> Pak= -0.0427686
>> Pakage= -0.0039254
>> Pakage2= 0.0000899
>> Pakage3= 0.0000234
>> As people have since told me, it is fine that the coefficients change value after centring, but the interactions of ethnicity with age and age2 have switched sign (from positive to negative and vice versa) after centring. Is this what one would expect?
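
The change in the lower-order coefficients, including the sign switches, is what the algebra of centring predicts. Writing age = (age - c) + c and expanding a cubic gives new linear and quadratic coefficients that mix in the higher-order ones, while the cubic coefficient stays the same. The Python sketch below applies this to the coefficients quoted above. The function name is an assumption for illustration, and the centring constant c is not stated in the thread: it is backed out from the reported change in the Age2 coefficient, which implies a mean age of about 32 weeks.

```python
def recentre_cubic(b0, b1, b2, b3, c):
    """Coefficients of the same cubic rewritten in powers of (age - c).

    b0 + b1*age + b2*age^2 + b3*age^3
      == a0 + a1*(age - c) + a2*(age - c)^2 + a3*(age - c)^3
    """
    a0 = b0 + b1 * c + b2 * c**2 + b3 * c**3
    a1 = b1 + 2 * b2 * c + 3 * b3 * c**2
    a2 = b2 + 3 * b3 * c
    a3 = b3  # the highest-order coefficient is unchanged by centring
    return a0, a1, a2, a3


# Age terms quoted for the uncentred fit.
age1, age2, age3 = 0.256372, -0.0009669, -0.0000291
# Age and Age2 quoted for the other (centred) fit.
age1_reported, age2_reported = 0.1062287, -0.0037464

# Assumption: c is inferred from the quadratic coefficients, not reported.
c = (age2_reported - age2) / (3 * age3)

# The overall intercept is not quoted in the thread, so 0.0 is a placeholder
# and the returned constant term is ignored.
_, age1_centred, age2_centred, age3_centred = recentre_cubic(
    0.0, age1, age2, age3, c
)

# Ethnicity terms quoted for the uncentred fit (Pak, Pakage, Pakage2, Pakage3).
pak_centred = recentre_cubic(-0.5843112, 0.0617149, -0.0021505, 0.0000234, c)

print(f"implied centring constant: {c:.2f}")
print(f"recomputed centred Age coefficient: {age1_centred:.4f}")
print("recomputed centred Pak, Pakage, Pakage2, Pakage3:", pak_centred)
```

Under this assumption the recomputed values agree with the reported centred coefficients up to the rounding of the quoted figures, and Pakage and Pakage2 change sign in the same way as reported.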

>>> I was under the impression that the age coefficients in a centred model shouldn't be different to an uncentred model though, and mine change.
>>> Is this change therefore ok?

Nick Cox

>>> Whether or not it helps in your model, I see no problem in what you 
>>> describe. It's the way that linear, quadratic and cubic terms work 
>>> together in a model that's important.
>>> All that said, there are quite possibly better ways of doing what 
>>> you want, such as cubic splines or fractional polynomials, which are 
>>> well supported in Stata.

On Wed, Jan 30, 2013 at 7:08 PM, Thomas Norris <> wrote:

>>>> I am having trouble with centering my independent variable (age) in a cubic polynomial.
>>>> I have generated the centred age by using gen centrage = age - r(mean)
>>>> and then, to get the centred quadratic and cubic terms, I simply raise
>>>> centrage to ^2 and ^3 respectively (gen centrage2 = centrage^2) (gen
>>>> centrage3 = centrage^3).
>>>> However, the negative centred age terms (ie those smaller than the mean) become positive when squaring them, which is what is mathematically correct, but it doesn't help my models.
>>>> If for example the mean was 30 weeks and I had 2 separate obs, one at 25 weeks and one at 35 weeks, the centred age would be -5 and 5, but the centred age^2 are both 25.
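
The example above can be written out for all three polynomial terms. This is a small Python sketch using the mean and ages from the example; the function name is an illustrative assumption. The squared terms coincide, but the linear and cubic terms keep opposite signs, so the two observations remain distinguishable when all three terms are in the model.

```python
def cubic_terms(age, mean_age):
    """Centred linear, quadratic and cubic terms for one observation."""
    centred = age - mean_age
    return centred, centred**2, centred**3


# The two observations from the example: mean 30 weeks, ages 25 and 35.
early = cubic_terms(25, 30)
late = cubic_terms(35, 30)

# Same squared term, opposite-signed linear and cubic terms.
assert early[1] == late[1]
assert early[0] == -late[0] and early[2] == -late[2]
print(early, late)
```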
