
RE: st: Appropriate modelling - testing which set of exposures are more important

From	Amal Khanolkar <[email protected]>
To	"[email protected]" <[email protected]>
Subject	RE: st: Appropriate modelling - testing which set of exposures are more important
Date	Fri, 28 Sep 2012 22:52:52 +0000

Dear Maarten and David,

Thank you so much for your detailed answers - they were very helpful!

I decided to try both the options suggested by you:

1. Comparing the R-squared values of three models: one with ethnicity only, second with SEP only and the third with both ethnicity and SEP. 

Unfortunately, the 'fully adjusted model' with all confounders explains only ~13% of the variation in my outcome in each of the three models above, which suggests that neither ethnicity nor SEP explains more than the other. Please correct me on this.

2. Using the sheaf coefficient (I read your paper in the Stata Journal, thanks so much for this!), I get the following. I hope I specified the dummies in the right way:

The first model:

 xi: regress wght_gain i.mom_race2 age_mom i.parity gestcalc i.cigs_befx i.gestdb i.gesthy i.MBMI ht_cm i.edu_mom i.marriedx i.edu_dad if plural==1

(Above, my main sets of exposures are 'mom_race2' (ethnicity), 'edu_mom' (education) and 'marriedx' (civil status). The last two, education and marriage, together are latent variables for SEP.)

The second model:

 sheafcoef, latent(mom_race2:_Imom_race2* ; edu_mom:_Iedu_mom* ; marriedx:_Imarriedx*) /// 
 eform post

As my categorical variables already had dummies, I just included them as above and skipped the first steps in the available examples of sheaf coefficient modelling.

. sheafcoef, latent(mom_race2:_Imom_race2* ; edu_mom:_Iedu_mom* ; marriedx:_Imarriedx*) ///
>    eform post
---------------------------------------------------------------------------------
      wght_gain |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
main            |
    mom_race2_e |    1.60668   .0234559    68.50   0.000     1.560708    1.652653
      edu_mom_e |   1.317605   .0257826    51.10   0.000     1.267073    1.368138
     marriedx_e |   1.164428   .0177144    65.73   0.000     1.129709    1.199148
      age_mom_e |   .9492895   .0028365   334.67   0.000       .94373     .954849
   _Iparity_1_e |   .5225553   .0181558    28.78   0.000     .4869706    .5581401
   _Iparity_2_e |   .4679537   .0191524    24.43   0.000     .4304157    .5054917
   _Iparity_3_e |   .4039802   .0170978    23.63   0.000     .3704692    .4374912
     gestcalc_e |   1.310362   .0077922   168.16   0.000     1.295089    1.325634
_Icigs_befx_1_e |   1.622747   .0767617    21.14   0.000     1.472297    1.773197
   _Igestdb_1_e |   .2832177   .0173003    16.37   0.000     .2493096    .3171258
   _Igesthy_1_e |   7.343815   .4643304    15.82   0.000     6.433744    8.253886
     _IMBMI_2_e |   .2390739   .0077441    30.87   0.000     .2238957    .2542521
     _IMBMI_3_e |   .0113847   .0004101    27.76   0.000     .0105809    .0121884
        ht_cm_e |   1.068248   .0021222   503.37   0.000     1.064089    1.072407
  _Iedu_dad_2_e |   1.244338   .0601069    20.70   0.000     1.126531    1.362145
  _Iedu_dad_3_e |    1.67542   .0894538    18.73   0.000     1.500094    1.850747
  _Iedu_dad_4_e |   1.277575   .0843029    15.15   0.000     1.112344    1.442805
  _Iedu_dad_5_e |   1.206204   .0735752    16.39   0.000     1.061999    1.350408
  _Iedu_dad_6_e |   1.198912   .0855525    14.01   0.000     1.031232    1.366592
        _cons_e |   .0058087   .0023308     2.49   0.013     .0012405     .010377
----------------+----------------------------------------------------------------
on_mom_race2    |
  _Imom_race2_2 |  -2.949192   .1347202   -21.89   0.000    -3.213238   -2.685145
  _Imom_race2_3 |  -.0324167   .2241983    -0.14   0.885    -.4718373    .4070039
  _Imom_race2_4 |  -1.895132   .0896311   -21.14   0.000    -2.070805   -1.719458
  _Imom_race2_5 |  -2.362808    .074601   -31.67   0.000    -2.509023   -2.216592
  _Imom_race2_6 |  -2.803899   .2082657   -13.46   0.000    -3.212092   -2.395706
  _Imom_race2_7 |  -1.299902   .2959342    -4.39   0.000    -1.879922   -.7198813
----------------+----------------------------------------------------------------
on_edu_mom      |
    _Iedu_mom_2 |    1.53707    .156318     9.83   0.000     1.230692    1.843448
    _Iedu_mom_3 |   2.650745   .1205822    21.98   0.000     2.414409    2.887082
    _Iedu_mom_4 |   2.237552   .1753927    12.76   0.000     1.893788    2.581315
    _Iedu_mom_5 |   2.880519   .0884774    32.56   0.000     2.707107    3.053932
    _Iedu_mom_6 |   2.978541   .1417484    21.01   0.000      2.70072    3.256363
----------------+----------------------------------------------------------------
on_marriedx     |
   _Imarriedx_2 |   2.406038   .0064151   375.06   0.000     2.393464    2.418611
   _Imarriedx_3 |   1.836238   3.754316     0.49   0.625    -5.522086    9.194563
---------------------------------------------------------------------------------

- Just to make sure that I interpret the findings correctly:

Here again, we see that the effects of ethnicity, education and marriage do not differ that much from each other. But this is the 'overall' effect of ethnicity, education and marriage on the outcome. It seems (if I understand correctly) that the effects of the individual categories of ethnicity, education and marriage on the outcome cannot be compared directly here. (I am not sure how to correctly interpret the values in the 'on_mom_race2' and 'on_edu_mom' rows.) Could you help me with these?

Thanks!

Regards,

/A.

Amal Khanolkar, PhD candidate,

________________________________________
From: [email protected] [[email protected]] on behalf of David Hoaglin [[email protected]]
Sent: 28 September 2012 14:08
To: [email protected]
Subject: Re: st: Appropriate modelling - testing which set of exposures are more important

Dear Amal,

As Maarten Buis explains, a thorough answer to your question may not
be simple.  I need to study the article that he cited (submitted to
the Stata Journal).  A few less sophisticated comments may be helpful.

The coefficients in the model in Step 3 tell you about the effects of
SEP on the outcome, adjusting for ethnicity (and the confounders) AND
the effects of ethnicity on the outcome, adjusting for SEP (and the
confounders).  SEP and ethnicity are on an equal footing in that
model.

The predictors in Step 4 should include the "main effects" of
ethnicity and SEP, not just the interaction effects.  You can use the
## operator on those factor variables.
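
A minimal sketch of that syntax, using Stata's bundled nlsw88 data as a stand-in. Treating race as "ethnicity", collgrad as "SEP" and age as a confounder is an assumption made only for illustration; the real variable names would replace them.

```stata
* Sketch only: nlsw88 ships with Stata and stands in for the real data.
* race ~ ethnicity, collgrad ~ SEP, age ~ confounder (all assumptions).
sysuse nlsw88, clear

* a##b expands to the main effects of a and b plus their interaction,
* so no xi: prefix is needed (factor variables, Stata 11 or later).
regress wage i.race##i.collgrad age

* Joint Wald test of all interaction terms at once.
testparm i.race#i.collgrad
```

With three race categories and a binary collgrad, the joint test has (3-1)*(2-1) = 2 numerator degrees of freedom.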

If those interaction effects (in the revised Step 4) are statistically
significant, you will need to interpret the effects of SEP separately
for each category of ethnicity and the effects of ethnicity separately
for each category of SEP.  A transformation of the outcome variable
(or a suitable choice of link function) may remove or reduce
interactions if they are present.
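
One hedged way to check whether a transformation matters is to run the same interaction test on the raw and on the log scale. The sketch below again uses the bundled nlsw88 data, with race, collgrad and age as purely illustrative stand-ins for ethnicity, SEP and a confounder; the log is only one possible transformation.

```stata
* Sketch only: stand-in data and variable roles are assumptions.
sysuse nlsw88, clear

* Interaction test on the original scale of the outcome.
regress wage i.race##i.collgrad age
testparm i.race#i.collgrad
scalar p_raw = r(p)

* Same model on the log scale (wage is strictly positive here).
generate double lnwage = ln(wage)
regress lnwage i.race##i.collgrad age
testparm i.race#i.collgrad
scalar p_log = r(p)

display "interaction p-value, raw scale: " %6.4f p_raw
display "interaction p-value, log scale: " %6.4f p_log
```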

If interactions are not an issue, a simple (simplistic?) approach to
assessing the relative "importance" of ethnicity and SEP would fit the
model in Step 2 and the corresponding model that contains SEP and not
ethnicity, and then look at the difference in R^2 between the model of
Step 3 and each of those two models.  That is one way of assessing the
contribution of SEP after accounting for ethnicity and the
contribution of ethnicity after accounting for SEP.  It may be
instructive to compare the values of R^2 for the two Step 2 models
against the R^2 of the model that contains neither ethnicity nor SEP.
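
The comparison can be sketched as follows, assuming the bundled nlsw88 data as a stand-in (race for ethnicity, collgrad for SEP, age and ttl_exp for confounders; none of these come from the real study). All four models are fitted on the same estimation sample so that the R^2 values are comparable.

```stata
* Sketch only: stand-in data and variable roles are assumptions.
sysuse nlsw88, clear

* Fix one common estimation sample from the fullest model.
regress wage i.race i.collgrad age ttl_exp
generate byte insample = e(sample)

* Confounders only.
regress wage age ttl_exp if insample
scalar r2_none = e(r2)

* Ethnicity only (the Step 2 model).
regress wage i.race age ttl_exp if insample
scalar r2_eth = e(r2)

* SEP only (the corresponding model without ethnicity).
regress wage i.collgrad age ttl_exp if insample
scalar r2_sep = e(r2)

* Both exposures (the Step 3 model).
regress wage i.race i.collgrad age ttl_exp if insample
scalar r2_both = e(r2)

display "R^2 added by SEP after ethnicity:  " %6.4f (r2_both - r2_eth)
display "R^2 added by ethnicity after SEP:  " %6.4f (r2_both - r2_sep)
display "R^2 added by ethnicity alone:      " %6.4f (r2_eth - r2_none)
display "R^2 added by SEP alone:            " %6.4f (r2_sep - r2_none)
```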

It may be important to understand the relations between the potential
confounders and ethnicity and SEP --- in a diagram for the causal
model.

If your data are observational, it is more accurate to say "adjusting
for" instead of "controlling for."  In an observational study, the
potential confounders are not actually controlled.

David Hoaglin

On Thu, Sep 27, 2012 at 1:46 PM, Amal Khanolkar <[email protected]> wrote:
> Hello all,
>
> I need some advice on the following approach:
>
> I have two main exposures; maternal ethnicity and maternal socioeconomic position (SEP).
>
> I want to test which of the above two exposures is more important in determining maternal pregnancy outcomes.
>
> 1. I plan to use linear regression, as my outcome of interest is continuous.
> 2. Initially, the first model will test the effect of ethnicity on the outcome, controlling for potential confounders as follows:
>
> xi: regress outcome i.ethnicity confounder1 confounder2 i.confounder3
>
> 3. In the next step, I introduce the second main exposure, maternal SEP:
>
> xi: regress outcome i.ethnicity confounder1 confounder2 i.confounder3 i.SEP
>
> 4. I test for an interaction as follows:
>
> xi: regress outcome i.ethnicity*i.SEP confounder1 confounder2 i.confounder3
>
> Questions: If the effects of ethnicity on my outcome of interest change from step 2 to step 3, controlling for the same confounders in both models, is this enough evidence of one exposure being more important than the other? (I assume this isn't completely right, as in essence the model in step 3 gives the effect of SEP on the outcome adjusting for ethnicity.) But I hope the model with the interaction test solves this to some extent, as I will be able to see if socioeconomically disadvantaged mothers of certain ethnicities have a worse outcome than disadvantaged mothers belonging to other ethnic groups.
>
> If there are any better ways to improve the above approach, please let me know.
>
> Regards,
>
> /Amal.

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/


