Sorry, but this to me is just a restatement of
your previous posting, and addresses none of
the points I raised.
That aside,
I don't understand how a quadratic function can
have powers 3 3. Cubics in my experience are never
appropriate for global fits unless there are clear
dimensional grounds for using them, which seems unlikely
here.
Nick
n.j.cox@durham.ac.uk
Suzy
> Thanks for your response Nick. In a nutshell, age is not linear in the
> logit. I'm using the fracpoly command to identify the best functional
> form for age in the full model. The result returned from fracpoly was a
> quadratic function with powers 3 3 (which also looks good with
> fracplot). However, when I further assessed the model using the boxtid
> command, the results with the new age transformation were not
> favorable (the Ho was rejected). When I transformed another
> continuous variable in the same full logistic model (quadratic with
> powers 1 2 by fracpoly), the boxtid results were favorable, all graphs
> looked very good, and the diagnostics were good (linktest, etc...). I'm
> trying to understand why my results (fracpoly and boxtid) aren't
> consistent for the age variable, but are for all other continuous
> variables.
>
> Nick Cox wrote:
>
> >I am not clear what you think Statalist members know
> >that can help you here. For example, the field
> >in which you are working, what the response variable
> >-dmcat- means, and what other predictors there may be are all
> >hidden from view, so the chance of giving opinions
> >drawing on substantive expertise is zero. Otherwise
> >put, you appear to be assuming that the choices
> >here can all be made on purely statistical criteria,
> >an attitude which always worries me greatly.
> >
> >What I have observed, as a kind of anthropologist of
> >statistical science, is that age plays very different
> >roles in different fields. Economists often seem
> >to find that a quadratic in age does very nicely,
> >whereas biostatisticians often seem to need
> >more complicated representations, which seems
> >perfectly plausible given the complexities of
> >childhood, adolescence, etc.
> >
> >Either way, -fracpoly- like other programs has
> >no inbuilt sensor (or censor) selecting theoretically or
> >scientifically sensible functional forms. So,
> >I suggest that you plot the curve implied against
> >age and think about it as something that needs justification
> >or interpretation independently from the data.
> >
> >Nick
> >n.j.cox@durham.ac.uk
> >
> >Suzy
> >
> >
> >
> >>I am trying to transform one final continuous independent variable (age)
> >>in a logistic regression model. I've tried what I know that's available
> >>via Stata. For example, I used the fracpoly command and the best
> >>transformation was a second order polynomial with powers 3 3.
> >>
> >>Fractional polynomial model comparisons:
> >>---------------------------------------------------------------
> >>age             df    Deviance     Gain    P(term)  Powers
> >>---------------------------------------------------------------
> >>Not in model     0    2098.129      --       --
> >>Linear           1    1834.224    0.000    0.000    1
> >>m = 1            2    1805.957   28.267    0.000    -1
> >>m = 2            4    1791.327   42.897    0.001    3 3
> >>m = 3            6    1790.526   43.699    0.670    -2 3 3
> >>m = 4            8    1788.431   45.793    0.351    -2 -2 3 3
> >>---------------------------------------------------------------
> >>
> >>
> >>I then used fracgen to generate the new age variables - age_1 and age_2.
> >>
> >>fracgen age 3 3
> >>-> gen double age_1 = X^3
> >>-> gen double age_2 = X^3*ln(X)
> >> (where: X = (age+1)/10)
> >>
> >>
> >>
> >>
> >>
> >>The coefficients for age_1 and age_2 from the full logistic regression
> >>model:
> >>------------------------------------------------------------------------------
> >>       Y var | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
> >>-------------+----------------------------------------------------------------
> >>       age_1 |   1.087994   .0093302     9.83   0.000      1.06986    1.106436
> >>       age_2 |   .9644247   .0037538    -9.31   0.000     .9570955    .9718101
> >>
> >>
> >>However the boxtid command rejected the null for both age_1 and age_2....
> >>
> >>       age_1 |   .0100805   .0007172    14.055   Nonlin. dev. 24.646 (P = 0.000)
> >>          p1 |   .0535714   .2122906     0.252
> >>------------------------------------------------------------------------------
> >>       age_2 |  -.0021756   .0004885    -4.453   Nonlin. dev. 7.894 (P = 0.005)
> >>          p1 |   3.864227   2.133377     1.811
> >>
> >>
> >>In all other respects, the preliminary diagnostics look good...
> >>
> >>Linktest:
> >>------------------------------------------------------------------------------
> >>       dmcat |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
> >>-------------+----------------------------------------------------------------
> >>        _hat |   .8900851   .1153855     7.71   0.000     .6639337    1.116236
> >>      _hatsq |  -.0319886   .0307101    -1.04   0.298    -.0921793    .0282022
> >>       _cons |  -.0450195   .1069617    -0.42   0.674    -.2546606    .1646215
> >>------------------------------------------------------------------------------
> >> lroc
> >>
> >>Logistic model for dmcat
> >>
> >>number of observations = 3354
> >>area under ROC curve = 0.8647
> >>
> >>etc...etc...etc...
> >>
> >>My question is: should I be concerned with the results of the boxtid
> >>command? Is there something I've done incorrectly or something else I
> >>can do/should do?
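
The suggestion made earlier in the thread, to plot the curve implied
against age, can be tried directly from the numbers quoted above. With
powers 3 3, fracgen builds the repeated-power pair X^3 and X^3*ln(X),
where X = (age+1)/10, so the fitted term is a degree-2 fractional
polynomial rather than a quadratic in age. The sketch below is in Python
rather than Stata and rests on several assumptions: it uses only the two
rounded odds ratios from the quoted logistic output, it ignores the
intercept and every other covariate (so it shows the shape of the age
term, not predicted probabilities), and the function names and the age
grid are hypothetical, chosen only for illustration.

```python
import math

# Odds ratios for age_1 and age_2, copied from the quoted logistic output.
OR_AGE_1 = 1.087994
OR_AGE_2 = 0.9644247

# Logit-scale coefficients are the natural logs of the odds ratios.
B1 = math.log(OR_AGE_1)
B2 = math.log(OR_AGE_2)


def scaled_age(age):
    """Scaling reported by fracgen in the thread: X = (age + 1) / 10."""
    return (age + 1) / 10


def fp2_logit_contribution(age):
    """Age term on the logit scale: B1 * X^3 + B2 * X^3 * ln(X).

    Assumption: no centering and no other covariates, so values are
    meaningful only relative to one another (the shape of the curve).
    """
    x = scaled_age(age)
    return B1 * x**3 + B2 * x**3 * math.log(x)


def turning_point_age():
    """Age at which the age term stops rising.

    d/dX [B1 X^3 + B2 X^3 ln X] = X^2 * (3*B1 + B2*(3*ln X + 1)),
    which is zero when ln X = -B1/B2 - 1/3.
    """
    log_x = -B1 / B2 - 1 / 3
    return 10 * math.exp(log_x) - 1


if __name__ == "__main__":
    # Hypothetical age grid; the thread does not state the observed range.
    for age in range(20, 91, 10):
        print(f"age {age:3d}  logit contribution {fp2_logit_contribution(age):8.3f}")
    print(f"turning point at age {turning_point_age():.1f}")
```

With the rounded odds ratios quoted above, the turning point falls at an
age in the low seventies. Whether the rise before that age and the fall
after it make substantive sense is a question the data alone cannot
answer, and it depends on whether such ages occur in the sample at all.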