



Re: st: ln transform and box cox


From   Rebecca Pope <rebecca.a.pope@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: ln transform and box cox
Date   Thu, 7 Mar 2013 08:45:22 -0600

David's advice not to model every non-linear relationship as
quadratic is sound. However, I want to make sure that my earlier
point about -fracpoly- was not misunderstood.

What I'm not saying: You must use a quadratic function. Rereading my
first post, I realize that the throw-away sentence about which is more
common may have sounded like advice to default to that. It should not
be taken as such. That's what loose language gets you. I'm sorry.

What I am saying: You do not need to go mining through your data with
a host of functions and see which one happens to fit best. For a
variety of reasons I think this is bad practice, but my opinion on
that doesn't matter much. More importantly, if you understand the
general relationship between your variables, you should have an idea
of some functions that are likely candidates. I offered 2 functions
that I have seen applied to age. Jay has suggested splines, which I
(clearly) didn't think of but which are an excellent suggestion. There
may well be others that suit fetal growth; however, it seems like there
should be a self-limiting set that are reasonable and easily
interpretable. Others may not value that last point. If you have a
limited set of functions selected based on theory, prior knowledge, or
just having looked at the data, you can select the best one yourself,
so there is no need to artificially constrain yourself to the Stata
functions that will work with -fracpoly-.
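
To make "select the best one yourself" concrete, here is a minimal
sketch that fits a small, pre-chosen set of forms for age and compares
them by AIC/BIC. The variable names (weight, age) and the three
candidates are assumptions for illustration, not the functions from my
first post, and plain -regress- ignores the repeated measures, so
treat it as a first look only.

```stata
* Assumed variable names: weight (outcome), age (predictor).

* Candidate 1: quadratic in age
regress weight c.age##c.age
estimates store quad

* Candidate 2: log of age (assumes age > 0 for every observation)
generate double lnage = ln(age)
regress weight lnage
estimates store logage

* Candidate 3: restricted cubic spline, 4 knots at default percentiles
* (creates agesp1, agesp2, agesp3)
mkspline agesp = age, cubic nknots(4)
regress weight agesp*
estimates store rcs

* Side-by-side AIC and BIC for the stored fits
estimates stats quad logage rcs
```

If theory or prior knowledge points to particular ages, -mkspline-
also accepts explicit knot locations through its knots() option
instead of nknots().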

On Wed, Mar 6, 2013 at 4:31 PM, David Hoaglin <dchoaglin@gmail.com> wrote:
> Also, you should examine the choice of functional form for age in the
> context of the model that contains all the explanatory variables that
> you plan to use. The adjustments for those explanatory variables may
> affect the apparent pattern of the relation of the dependent variable
> to age.

Yes. And this also brings us back to Tom's concern about increasing
variance as the subjects aged. This is, I think, motivated by a fear
that non-constant Var(weight) indicates that the error variance is not
constant over time (i.e. not homoskedastic). My understanding is that
this concern, unlike something like the functional form, can't be
assessed in the absence of conditioning on the explanatory variables.
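
As a rough sketch of what "conditioning on the explanatory variables"
could look like in practice: fit the model first, then look at the
residuals against age rather than at raw weight against age. The
variable names and the contents of the local macro are hypothetical,
and the quadratic in age is only a stand-in for whatever form is
chosen.

```stata
* Hypothetical covariate list; replace with the real explanatory variables.
local xvars "x1 x2"

* Fit with age and the other covariates in the model
regress weight c.age##c.age `xvars'

* Residuals conditional on the fitted model
predict double res, residuals

* A fan shape here, not in the raw data, is what would suggest
* non-constant error variance
scatter res age, yline(0)
```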

As Anthony noted, one of the wonderful things about the list is that
you get great discussions around a topic, not just focused help-line
responses, so I'm going to try to take advantage of that. Given the
setup of the study, is age not analogous to time? If so, is it not
true then that Var(weight) should be, by definition, increasing with
age? So, one couldn't directly draw conclusions about violated
assumptions just looking at Var(weight) plotted against age.
Specifically, time (=age) needs to be explicitly incorporated into the
random part of the model. As a skeleton:

xtmixed weight age xvars || subject: age

where Tom would add whatever appropriate functional form of age,
options, and additional variables (xvars) were needed.
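
To spell out why Var(weight) can grow with age under that skeleton,
here is the implied variance, assuming a linear age term, a random
intercept u_{0i}, and a random slope u_{1i} on age. The symbols
(tau_00, tau_11, tau_01, sigma^2) are my notation, not anything Stata
reports under those names.

```latex
% Random-intercept, random-slope model for subject i at occasion j
\text{weight}_{ij} = \beta_0 + \beta_1\,\text{age}_{ij}
    + \mathbf{x}_{ij}'\boldsymbol{\gamma}
    + u_{0i} + u_{1i}\,\text{age}_{ij} + \varepsilon_{ij}

% Variance components
\operatorname{Var}(u_{0i}) = \tau_{00}, \quad
\operatorname{Var}(u_{1i}) = \tau_{11}, \quad
\operatorname{Cov}(u_{0i}, u_{1i}) = \tau_{01}, \quad
\operatorname{Var}(\varepsilon_{ij}) = \sigma^2

% Implied marginal variance, conditional on age and x
\operatorname{Var}(\text{weight}_{ij})
    = \tau_{00} + 2\,\tau_{01}\,\text{age}_{ij}
    + \tau_{11}\,\text{age}_{ij}^{2} + \sigma^2
```

So the variance of weight is quadratic in age even when sigma^2 is
constant, and it increases with age whenever age > -tau_01/tau_11.
Note that -xtmixed- defaults to covariance(independent), which fixes
tau_01 at 0; adding the covariance(unstructured) option lets the
intercept-slope covariance be estimated.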

Regards,
Rebecca

On Wed, Mar 6, 2013 at 7:52 PM, JVerkuilen (Gmail)
<jvverkuilen@gmail.com> wrote:
> On Wed, Mar 6, 2013 at 7:15 PM, Anthony Fulginiti <fulginit@usc.edu> wrote:
>> Although I did not post the initial question, I welcome the helpful book reference.  I have been using the Singer and Willett book, "Applied Longitudinal Data Analysis" but looked at Fitzmaurice website
>> and found it to be another great source.  Related to the post:  I followed a similar modeling strategy as Tom (specifying the polynomial function of time that best fit the data first and subsequently adding
>> variables of central interest and controls) but plan to reexamine the issue with the explanatory vars in the model.
>
> Regression splines may be a good way to go about getting a flexible
> specification for age. If you know that there are a few key ages, put
> knots there.
>
>
> --
> JVVerkuilen, PhD
> jvverkuilen@gmail.com
>
> "It is like a finger pointing away to the moon. Do not concentrate on
> the finger or you will miss all that heavenly glory." --Bruce Lee,
> Enter the Dragon (1973)
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/

